1 Overview 

The iTAP Switch Element (iTSE) is a communications chip which can be used as a stand-alone 
device to implement a 12 x 12 port switching fabric. When combined with other iTSEs it is possible to 
create larger switch fabrics up to 1728 x 1728 ports for a 5-stage banyan network or up to 572 x 572 
ports for a 5-stage Clos network. 

Each port of an iTSE can simultaneously carry a mixed traffic load of TDM traffic, ATM traffic, 
and Packet traffic. 

The iTAP switch fabric (comprised of one or more iTSEs) will typically be used as the 
interconnect scheme between iTAP Port Processors (iTPP). 

Feature Highlights 

• Synchronous Switching Architecture - All links which interconnect iTSE and iTPP ports are 
synchronized to a common data clock and row start reference. 

• Bufferless Switching Fabric - Packet buffering is implemented via a combination of input and 
output queues within the iTPPs. 

• For TDM traffic, the iTSE will support Time-Space-Time switching. 

• The switching granularity for TDM traffic is at the VT1.5 level. 

• For ATM/IP traffic, the iTSE will support a self-routed switching scheme. Since the iTSE will not 
implement packet buffers, a 2-phase switching algorithm is used. During phase 1, self-routed 
request messages will be transmitted across the switching fabric in an overlay control channel 
which matches the data interconnection paths. A "knockout" principle is then used to determine 
which requests will be serviced. The actual ATM/IP data is then sent through the switch fabric 
during phase 2. Requests not serviced during phase 1 will typically be re-requested during the 
next switch arbitration cycle. 

• The switching granularity for ATM/IP traffic will be 64-byte fixed length PDUs. 
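
The 2-phase scheme in the ATM/IP bullet above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the request tuple layout, the numeric priority (larger wins), the tie-break on input port number, and the per-output `capacity` parameter are all hypothetical and are not taken from this specification.

```python
from collections import defaultdict


def arbitrate(requests, capacity=1):
    """Phase 1: decide which requests are serviced this cycle.

    requests: list of (input_port, output_port, priority) tuples (assumed layout).
    capacity: PDUs an output port can accept per cycle (assumed value).
    Returns (granted, knocked_out).
    """
    by_output = defaultdict(list)
    for req in requests:
        by_output[req[1]].append(req)

    granted, knocked_out = [], []
    for output_port in sorted(by_output):
        # Higher priority first; ties go to the lower input port number.
        ranked = sorted(by_output[output_port], key=lambda r: (-r[2], r[0]))
        granted.extend(ranked[:capacity])
        knocked_out.extend(ranked[capacity:])
    return granted, knocked_out


def run_cycles(requests, capacity=1):
    """Repeat arbitration until every request has been serviced."""
    schedule = []
    pending = list(requests)
    while pending:
        granted, pending = arbitrate(pending, capacity)
        schedule.append(granted)  # phase 2: data for these requests is sent
    return schedule
```

For example, two requests contending for output 5 are resolved over two arbitration cycles, with the knocked-out request re-requested in the second cycle.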

1.1 Conventions in this Specification 

The following conventions are used in this specification: 

1.1.1 Terms and Concepts 

Before proceeding to describe the operations of the iTSE, it will be useful to describe some 
terms and concepts which will be used throughout this document. The terms presented here are 
described in detail in Section 2; they are presented here only in summary format. 

Port 

A port is a physical interface on the iTSE which is used to interconnect to other iTSEs or iTPPs 
to form a switching fabric. The iTSE will have 12 input ports and 12 output ports. Each port 
is comprised of multiple LVDS channels. Specifically, an input port consists of 2 LVDS 
channel inputs and one LVDS channel output, while an output port consists of 2 LVDS 
channel outputs and one LVDS channel input. 

Link 

This is the term used to describe the connection between the output port on one iTSE or iTPP 
and the input port on another iTSE or iTPP. The link physically consists of the circuit board 
wires to interconnect the LVDS channels. A link will always connect the bundle of LVDS 
channels from a single output port to a single input port, i.e., mixing of LVDS channels 
between links is not allowed. The LVDS channel bundle which makes up a link will consist 
of 3 LVDS pairs: 2 pairs are used to carry the data traffic and 1 pair is used as an overlay 
network in the reverse path for carrying arbitration grants. 

LVDS Channel 

Individual LVDS channels will be bundled together to form a single link. The terms "LVDS 
channel" and "channel" will be used interchangeably in this document. 

Row 

The synchronous switching mechanism within the iTSE operates on "row" boundaries. The 
term row will be used to describe the traffic which is carried across a link during a single row 
time. Row start times will occur at a 72 kHz rate (13.9 us). A row of data will consist of all the 
data carried on the link during a given row time. The row of data will be byte interleaved 
across multiple LVDS channels in order to achieve an aggregate data rate of 4.4 Gbps across 
the link. 

Frame 

A frame consists of a group of 9 rows which results in a frame rate of 8 kHz. 

Group 

The row structure is subdivided to carry 96 groups. Each group will be comprised of a block 
of 16 slots. The fixed length data PDUs are mapped into these groups (one PDU per group). 
The concept of switching a group of data (or a single PDU) is reserved for data carried in a 
single group. 

Slot 

In addition to subdividing the row into groups, the row can also be subdivided into slots. A slot 
will be 36 bits wide. TDM traffic will be mapped onto the row using the slot terminology. 
Switching on a slot basis is reserved for TDM traffic. 

PDU 

Protocol Data Unit. A PDU will be defined for the iTAP chip which will be used to carry either 
ATM cells or IP Packets. Since the PDU will be fixed to a length of 64 bytes, longer IP packets 
will need to be fragmented and carried through the switch fabric on multiple PDUs. 

Speedup 

Concept where the switch fabric I/O ports each run faster than the external line rate. The 
ratio of switch fabric port speed / external line rate is the speedup. Speedup helps reduce 
input and output blocking through the switch. 

Strictly Nonblocking 

A switch is strictly nonblocking if a connection can always be set up between any idle input 
and output without the need to rearrange the paths taken by existing connections. 

Recirculating Buffers 

Used in a switch fabric. If multiple cells are destined to go through the same switch path, 
only one is allowed through and the rest are sent to the recirculating buffer where they will 
be looked at during the next switch cycle. This is frequently done to support multicasting. 
Also, recirculating buffers are sometimes timestamped so the cell stored in them will be 
discarded if it isn't forwarded within a given time interval. One thing to watch out for when 
using recirculating buffers is cell reordering: cells must be forwarded out the 
output port of the switch in the same order they're received at an input port. 
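
The PDU and Speedup definitions above imply some simple arithmetic, sketched below in Python. This is a minimal illustration, not the iTPP behaviour: it assumes the full 64 bytes of each PDU are available as payload and that the last fragment is zero padded, neither of which is stated in this specification, and the function names are hypothetical.

```python
import math

PDU_BYTES = 64  # fixed PDU length from this specification


def pdus_needed(packet_len, payload_bytes=PDU_BYTES):
    """Number of PDUs required to carry a packet of packet_len bytes."""
    if packet_len <= 0:
        raise ValueError("packet length must be positive")
    return math.ceil(packet_len / payload_bytes)


def fragment(packet, payload_bytes=PDU_BYTES):
    """Split a packet into fixed-size chunks, zero padding the last one (assumed)."""
    if not packet:
        raise ValueError("packet must not be empty")
    chunks = [packet[i:i + payload_bytes]
              for i in range(0, len(packet), payload_bytes)]
    chunks[-1] = chunks[-1].ljust(payload_bytes, b"\x00")
    return chunks


def speedup(fabric_port_bps, line_rate_bps):
    """Ratio of switch fabric port speed to external line rate."""
    return fabric_port_bps / line_rate_bps
```

Under these assumptions a 1500-byte IP packet occupies 24 PDUs.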
1.1.2 Byte and Bit Ordering 

Byte order is big-endian; bit ordering is little-endian (LSB is bit 0). This is shown below. 

Figure 1-1: Byte and Bit Ordering Conventions for this Specification 



1.2 References 

Onex Communications Internal Documents - 

[1] 
[2] 
[3] 
[4] 
[5] 
[6] 
[7] 
[8] 
[9] 
[10] 

Papers on Switching - 

[11] S. Liew, M. Ng, and C. Chan, "Blocking and Nonblocking Multirate Clos Switching Networks," 
IEEE/ACM Trans. Networking, vol. 6, no. 3, pp. 307-318, June 1998. 

Nice simple review of Clos networks in section II. 

[12] 
[13] 


Proprietary and Confidential Information of Onex Communications Corporation 

1.3 Requirements 

1. RESERVED 







2 iTAP Switch Element Functional Description 

This section will provide the architectural overview of the iTSE. The objective here is to 
describe what the iTSE does, not how it is implemented. Some implementation concepts may be 
expressed in this section to aid in describing what the iTSE does; these implementation concepts may 
not reflect the actual implementation of the iTSE and are not constraints on the implementation 
approach to be chosen. The actual implementation will be provided in later sections of this document. 

2.1 Switch External Interface 

The figure below illustrates the iTAP Switch Element system I/O signals. 



[Figure: block diagram of the iTAP Switch Element. Input Links #0 through #11 each consist of 
DATA_IN_A, DATA_IN_B, and GRANT_OUT. Output Links #0 through #11 each consist of 
DATA_OUT_A, DATA_OUT_B, and GRANT_IN. The remaining interfaces are the Host Interface and 
Clocks & Control.] 

Figure 2-1: Switch Interface Block Diagram 



Input Links - 

Each of the 12 Input Links is comprised of 3 high speed serial LVDS I/O signal pairs. The 
Input Links provide the data traffic input to the iTSE and the grants back to the previous stage. 

The DATA_IN_A and DATA_IN_B pairs are used together to form a single high speed "logical" 
serial data input stream. This data input stream is used to carry both data traffic and the Request 
Elements which are used for bandwidth arbitration. The GRANT_OUT serial pair provides the output 
control channel for this Input Link. 

All three LVDS pairs associated with an input link will always be connected as a group to the 
Output Link of the source iTSE or iTPP. 

Multiple LVDS pairs must be used to create the single logical high speed Gbps serial stream 
because the maximum speeds currently supported by the candidate silicon vendors are less than 
what is needed for an individual data link. 

Output Links - 

The output link configuration matches the input link configuration. 
2.2 Serial Link Formats 

This section describes the formats for the serial links which are used to interconnect the 
iTSEs and iTPPs. 

Since the output link of an iTSE would be directly connected to the input link for the next 
stage iTSE or an output Port Processor, these data structures describe the serial data formats for both 
the input and output links. 

2.2.1 Design Objectives for Sizing Links 

RESERVED. 
2.2.2 Data Link 

The 4.4 Gbps serial data stream is actually implemented using multiple lower speed serial 
streams. For this discussion, how the 4.4 Gbps serial data is split between the lower speed streams is 
not relevant; always view the serial data associated with an individual link as a single 4.4 Gbps 
stream. 

As shown in the figure below, the serial data link channel is organized into slots, rows and 
frames. A "slot" consists of a 4-bit tag field and a 32-bit data (payload) field. The timing of rows and 
frames is architected to match that of SONET/SDH frame timing. This will simplify the switching of 
TDM traffic which originates in SONET/SDH payloads. 

The last 20 slots are reserved for Link Overhead and may not be used to carry TDM or data 
traffic. The frame is transmitted from left to right (slot 0 to slot 1699) and top to bottom (row 1 to row 
9). The msb of each slot is transmitted first. Note that for the data stream, a single PHY channel will 
not be capable of running at 4.4 Gbps, therefore the data stream will be split between two PHY 
channels each running at 2.2 Gbps. If the 1700 slot row is viewed instead as a 7650 byte row 
(1700 slots * 36 bits/slot / 8 bits/byte), the odd bytes will be transmitted on PHY channel A and the 
even bytes will be sent via channel B. The msb of each byte will be transmitted first. The row byte 
numbering starts at 0, thus the last byte of the row is byte number 7649. 
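
The byte interleaving across the two PHY channels can be sketched as follows. This is a minimal Python illustration; the function names are hypothetical, and the row length is derived from 1700 slots of 36 bits rather than hard-coded. Scrambling and serialization are not modeled.

```python
SLOTS_PER_ROW = 1700
BITS_PER_SLOT = 36
ROW_BYTES = SLOTS_PER_ROW * BITS_PER_SLOT // 8  # bytes in one row


def split_row(row):
    """Split a row into the two PHY channel byte streams.

    Byte numbering starts at 0: odd-numbered bytes go to channel A,
    even-numbered bytes go to channel B.
    """
    channel_a = row[1::2]
    channel_b = row[0::2]
    return channel_a, channel_b


def merge_row(channel_a, channel_b):
    """Reassemble a row from the two (already deskewed) channel streams."""
    row = bytearray(len(channel_a) + len(channel_b))
    row[1::2] = channel_a
    row[0::2] = channel_b
    return bytes(row)
```

Because the row holds an even number of bytes, each channel carries exactly half of them, which matches the two channels running at half the link rate.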
1 frame = 125 us 

1 row = 125 us / 9 = 13.89 us 

1 slot = 4 bit tag + 32 bit payload 

frame size = 1700 slots/row * 36 bits/slot * 9 rows/frame = 550,800 bits/frame 

serial bit rate = 550,800 bits/frame * 8 kHz = 4.4064 Gbps 

row size = 1700 slots/row = 7,650 bytes/row = 61,200 bits/row 

1 slot bandwidth = slot rate * 36 bits/slot = 72 kHz * 36 = 2.592 Mbps 



Figure 2-2: Data Frame Structure 
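
The frame arithmetic for Figure 2-2 can be checked with a short Python sketch. The constants are the slot, row, and frame parameters given in this section; the variable names are only illustrative.

```python
SLOTS_PER_ROW = 1700
BITS_PER_SLOT = 36       # 4-bit tag + 32-bit payload
ROWS_PER_FRAME = 9
FRAME_RATE_HZ = 8000     # 125 us frame

row_rate_hz = ROWS_PER_FRAME * FRAME_RATE_HZ        # 72,000 rows per second
row_time_us = 1e6 / row_rate_hz                     # about 13.89 us
bits_per_row = SLOTS_PER_ROW * BITS_PER_SLOT        # 61,200 bits
bytes_per_row = bits_per_row // 8                   # 7,650 bytes
bits_per_frame = bits_per_row * ROWS_PER_FRAME      # 550,800 bits
link_rate_bps = bits_per_frame * FRAME_RATE_HZ      # 4,406,400,000 bits per second
slot_bandwidth_bps = row_rate_hz * BITS_PER_SLOT    # 2,592,000 bits per second
```

One slot per row therefore corresponds to 2.592 Mbps of link bandwidth, and all 1700 slots together give the 4.4064 Gbps serial rate.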

For the transport of data PDUs, a block of 16 slots is required. The figure below illustrates one 
example of how the row slots may be allocated for carrying PDUs and Request Elements. As shown, 
the maximum PDU capacity for a row is 96. The term for a block of 16 slots which is capable of carrying 
a single PDU is "group". 

Note: Figure 2-3 illustrates just one partitioning of the row slots into groups and request 
elements; the implementation will allow flexibility in changing the row structure if necessary. 

For each group in the row, 1.5 slots of bandwidth are required for carrying a 48-bit Request 
Element (RE). Figure 2-3 illustrates how 2 REs are inserted into 3 slots within each of the first 24 
groups. All the REs need to be carried within the row as early as possible in order to allow the REs to 
ripple through the multi-stage switch fabric as soon as possible after the start of a row (see the 
Arbitration section for complete details). One option (not shown in this figure) would have been to 
send all 96 REs in the first 64 slots of the row. This is not being done because of the implementation 
approach for the Arbitration logic which processes the RE. The implementation requires the REs to be 
spaced out in time. The structure shown in Figure 2-3 is considered to be the optimal format given 
system requirements and implementation constraints. 

The row structure will in reality be different depending on which link of the switch it is 
configured for. For example, let's assume Figure 2-3 defines the row structure between the iTPP and 
the first iTSE of the first switch fabric stage. In this case the first block of 2 REs occupies the first 3 
slots of the row. The implementation of the arbitration logic which processes REs will require at least 
12 slot times of latency between each 3-slot block of REs on the input link. Also, there must be some 
latency from when the first REs of the row are received to when the REs are inserted into the output 
link; this latency is used by the arbitration logic for mapping incoming REs into the RE buffers. This 
means the row structure for the link into the second stage will have the first group of REs 
starting at slot time 32. This is illustrated in Figure 2-4. 
Figure 2-3: Row Structure, Input to Stage 1 

[Figure: row structure at the input to stage 2. Two 16-slot groups occupy the first 32 slots. A 
sequence of 3+8+3+8 slots is then repeated 24 times in the next 528 slots of the row. A 16-slot 
group is then repeated 70 times over the next 1120 slots.] 

Figure 2-4: Row Structure, Input to Stage 2 
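
The slot counts annotated on Figure 2-4 can be checked by building the row as a list of slot types. The Python sketch below is one reading of those annotations and should be treated as an assumption: it takes each 3-slot piece of the repeated 3+8+3+8 sequence to be a block of 2 REs and each 8-slot piece to be half of a 16-slot group, and it places the 20 Link Overhead slots at the end of the row. The labels "PDU", "RE", and "OH" are illustrative only.

```python
def stage2_row_layout():
    """Slot-type map for one 1700-slot row at the input to stage 2 (assumed reading)."""
    layout = []
    layout += ["PDU"] * 32                   # two 16-slot groups
    for _ in range(24):                      # 3+8+3+8 sequence, 24 times
        layout += ["RE"] * 3 + ["PDU"] * 8 + ["RE"] * 3 + ["PDU"] * 8
    layout += ["PDU"] * (70 * 16)            # 70 further 16-slot groups
    layout += ["OH"] * 20                    # Link Overhead, last 20 slots
    return layout


def summarize(layout):
    """Return (groups, request_elements, overhead_slots) for a layout."""
    groups = layout.count("PDU") // 16
    # Each RE is 48 bits, i.e. 1.5 slots, so 3 slots carry 2 REs.
    request_elements = layout.count("RE") * 2 // 3
    return groups, request_elements, layout.count("OH")
```

Under this reading the row holds 96 groups, 96 REs, and 20 overhead slots, with the first RE block starting at slot 32.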
2.2.3 TDM Traffic 

RESERVED. 

2.2.4 Data Traffic 

RESERVED 

2.2.4.1 Unicast Data PDU Format 

RESERVED. 

2.3 iTAP System Implementation 

RESERVED. 

2.3.1 Clos Networks 

RESERVED 

2.3.2 Redundancy 

RESERVED 

2.4 iTSE Switching Examples 

This section will provide examples which explain how data is sent through a 3-stage iTAP 
switch fabric. The examples will also summarize the control messages required to configure the 
switch connections. 

2.4.1 Single PDU 

The figure below illustrates a typical arbitration and data passing sequence for a data PDU. 

[Figure: Request and Grant Elements are each 1.5 slots. A PDU is 16 slots.] 

Figure 2-5: Single PDU Through a 3-Stage Switch Example 

Control & Configuration - 

To pass a data PDU through the switch fabric, control messages are not used to set up 
the switch path. Instead, a "self-routed" concept is used. This means each RE, GE, and PDU sent 
through the switch fabric contains a RouteTag which identifies the path through the switch. 

For each received ATM cell or IP packet, the iPP will classify which flow the data belongs to. 
The iPP will contain route tables which hold the RouteTag field to be used to forward the data 
through the switch. When a new flow is established, only the PP route tables need to be updated; the 
switch doesn't require any updates. 

2.4.2 Single TDM VT1.5 Channel 

The figure below illustrates a typical TDM data flow through a 3-stage switch fabric. 

Figure 2-6: Single TDM VT1.5 Through a 3-Stage Switch Example 

3 iTSE Implementation Overview 

A block diagram of the iTSE is shown in Figure 3-1. This section will provide a brief overview 
of what each module in the block diagram does. 

3.1 Datapath & Link BW Arbitration (per link modules) 

This group of modules is instantiated 12 times, once for each I/O link the iTSE supports. 
These modules implement the core data path switching functions. A summary of the signals on the 3 
internal datapath buses is shown in Section 3.4. 

3.1.1 Data Stream Deserializer 

The feature highlights of this module are: 

• Synchronize to the incoming serial data stream and then reassemble the row stream which is 
transported using two physical Unilink channels. Provide FIFO'ing on each incoming serial 
stream so that the streams may be "deskewed" prior to row reassembly. 

• Recover the 36-bit slot data from the row stream and forward it to a third FIFO which will be used for 
deskewing the 12 input links. This deskewing will allow all the input links to forward slot N to the 
switching core simultaneously. The link deskewing is controlled by the Link Synchronization & 
Timing Control module. 

• Continuously monitor the delta between where slot 0 of the incoming row is versus the internal 
row boundary signal within the iTSE. This result will be reported to the Link RISC Processor and 
will be used as part of the ranging process to synchronize the iTPP connected to the input link 
(this would be a first stage function only). 

The detailed description of this module is provided in Section 11. 

3.1.2 Data Stream Demapper 

This module is responsible for extracting the data from the incoming serial data links. The 
feature highlights of this module are: 

• Demapping of the input link slots. This means based on the input slot number determine if the 
traffic is TDM, PDU, or a Request Element. The determination is based on the contents of the 
Demapper RAM. 

• For TDM traffic, determine the destination link and row buffer memory address. This information 
is stored in a Demapper RAM which is configured by software as TDM connections are added or 
torn down. 

• For PDU traffic, assemble all 16 slots which make up the PDU into a single 512-bit (64-byte) PDU. Then 
forward this entire PDU word to the row buffer mapper logic. The PDUs are assembled prior to 
forwarding them to the row buffer so that the row buffer can write the entire PDU to the row 
buffer memory in a single clock cycle. This will provide the maximum possible write-side memory 
bandwidth to the row buffers. This is the most critical constraint of the iTSE implementation: 
being able to write 12 entire PDUs to a single row buffer in 6 link slot times (12 core clock cycles). 

• For Request Elements, assemble the 3-slot block of REs into two 48-bit REs and forward them to 
the Request Parser module. 

The detailed description of this module is provided in Section 4.3.1.2. 
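
The PDU assembly step of the Demapper can be sketched as follows. This Python fragment is only an illustration: it assumes the 4-bit tag occupies the most significant bits of each 36-bit slot and that payload bytes are emitted big-endian per Section 1.1.2. The slot bit layout is not confirmed by this section, and the function names are hypothetical.

```python
SLOTS_PER_PDU = 16
PAYLOAD_MASK = 0xFFFFFFFF


def unpack_slot(slot):
    """Split a 36-bit slot value into (tag, payload). Tag-in-upper-bits is assumed."""
    if not 0 <= slot < (1 << 36):
        raise ValueError("slot must be a 36-bit value")
    return slot >> 32, slot & PAYLOAD_MASK


def assemble_pdu(slots):
    """Collect the 16 slots of one group into (tags, 64-byte payload)."""
    if len(slots) != SLOTS_PER_PDU:
        raise ValueError("a PDU is carried in exactly 16 slots")
    tags = []
    payload = bytearray()
    for slot in slots:
        tag, data = unpack_slot(slot)
        tags.append(tag)
        payload += data.to_bytes(4, "big")  # big-endian byte order
    return tags, bytes(payload)
```

Sixteen 32-bit payload fields give the 64-byte PDU, while the sixteen full 36-bit slots give the 576-bit word written to the row buffer.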

3.1.3 Row Buffer Mapper 

This module is responsible for mapping traffic which is received from the Data Stream 
Demappers into the row buffer memories. The feature highlights of this module are: 

• FIFO the TDM traffic as it's received from the Data Stream Demappers. Then write it to the row 
buffer. The row buffer memory address is actually pre-configured in the Demapper RAM within 
the Data Stream Demapper module. That module will forward the address to the row buffer 
mapper along with the TDM slot data. 

• Write PDU traffic from the Data Stream Demappers to the row buffers. The Row Buffer Mapper 
will compute the address within the row buffer where each PDU will be written. PDUs will be 
written into the row buffers starting at address 0 and then every 16-slot address boundary 
thereafter, up to the maximum configured PDU addresses for the row buffer. 

The detailed description of this module is provided in Section 4.3.1.3. 
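
The PDU write-address rule can be sketched as a simple counter. This is a minimal Python illustration; the class name, the `None` return when the configured maximum is reached, and the per-row reset are assumptions and not taken from this specification.

```python
class PduAddressCounter:
    """Generates row buffer write addresses for PDUs within one row."""

    def __init__(self, max_pdus=96, slots_per_pdu=16):
        self.max_pdus = max_pdus            # maximum configured PDU addresses
        self.slots_per_pdu = slots_per_pdu
        self._count = 0

    def start_row(self):
        """Reset at the row boundary so the next PDU lands at address 0."""
        self._count = 0

    def next_address(self):
        """Return the next 16-slot-aligned address, or None if the row is full."""
        if self._count >= self.max_pdus:
            return None
        address = self._count * self.slots_per_pdu
        self._count += 1
        return address
```

With the default of 96 PDUs per row, the last PDU is written at slot address 1520.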

3.1.4 Row Buffer 

This module simply contains the row buffer memory elements. The requirements are: 

• Provide double buffered row storage which will allow one row buffer to be written during row N 
while the row data which was written during row N-1 is being read out by the Data Stream 
Mapper. 

• Each row buffer must be capable of storing 1536 slots of data. This will allow the row buffer to 
store 96 PDUs or 1536 TDM slots or a combination of the two traffic types. Request elements and 
Link Overhead slots are NOT sent to the row buffer, therefore the row buffer does not need to be 
sized to accommodate the entire 1700 input link slots. 

• The row buffer write port must be 16*36=576 bits wide. It must support writing of only one 36-bit 
slot (TDM data) or writing of an entire 576-bit word (PDU data) in a single clock cycle. 

The detailed description of this module is provided in Section 4.3.1.4. 
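
The double-buffering requirement can be sketched in Python as a pair of banks that swap at each row boundary. This is a behavioural illustration only, not the memory design: the class and method names are hypothetical, `None` stands for a location that was not written during the previous row, and clearing the write bank at the row boundary is an assumption.

```python
ROW_BUFFER_SLOTS = 1536
SLOTS_PER_PDU = 16


class DoubleRowBuffer:
    """Two row buffers: one written during row N, one read (row N-1 data)."""

    def __init__(self):
        self._banks = [[None] * ROW_BUFFER_SLOTS for _ in range(2)]
        self._write_bank = 0

    def write_slot(self, address, slot):
        """Write a single 36-bit slot (TDM data)."""
        self._banks[self._write_bank][address] = slot

    def write_pdu(self, address, slots):
        """Write an entire 16-slot PDU word at a 16-slot boundary."""
        if address % SLOTS_PER_PDU or len(slots) != SLOTS_PER_PDU:
            raise ValueError("PDU writes must be 16 slots at a 16-slot boundary")
        if not 0 <= address <= ROW_BUFFER_SLOTS - SLOTS_PER_PDU:
            raise ValueError("address out of range")
        self._banks[self._write_bank][address:address + SLOTS_PER_PDU] = slots

    def read_slot(self, address):
        """Read from the bank that was written during the previous row."""
        return self._banks[1 - self._write_bank][address]

    def row_boundary(self):
        """Swap banks and clear the new write bank (clearing is assumed)."""
        self._write_bank ^= 1
        self._banks[self._write_bank] = [None] * ROW_BUFFER_SLOTS
```

Data written during one row becomes readable only after the next row boundary, and disappears one row later if it is not rewritten.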

3.1.5 Request Arbitration 

The request arbitration consists of 2 components: (1) a centralized Request Parser module and 
(2) a Request Arbitration module for each of the output links. 

Request Elements are extracted from the input slot stream by the Data Stream Demapper 
modules and then forwarded to the Request Parser. The Request Parser (which is summarized in 
Section 3.2.2) will forward the 48-bit request elements to the Request Arbitration modules via two 
request busses. Each request bus may contain a new request element each core clock cycle. This 
timing will allow the Request Arbitration logic to process all 13 request sources in less than 8 core 
clock cycles. The 13 request sources are the 12 input data streams and the internal Multicast & In- 
Band control messaging module. 

The Request Arbitration module will monitor the two request element buses and read in all 
request elements which are targeted for the output link that the Request Arbitration module is implementing. 

Requirements for this Request Arbitration module are: 

• Provide buffering for up to 24 request elements. 

• When a new request element is received store it in a free RE buffer. If there are not any free 
buffers, then replace the lowest priority RE which is already stored in a buffer with the new RE if 
the new RE is a higher priority. If the new RE is equal to or lower in priority than all REs 
currently stored in the buffers then discard the new RE. 

• On the output side, when the Data Stream Mapper module is ready to receive the next RE, 
forward the highest priority RE which is stored in the RE buffers to the Data Stream Mapper 
module. If the RE buffers are empty, then forward an "Idle" RE. 
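
The buffering requirements above can be sketched as a small priority buffer. This Python fragment is a behavioural illustration under stated assumptions: a larger number means higher priority, ties on output are broken in favour of the oldest RE, and the oldest of the lowest-priority REs is the one replaced. None of these tie-break rules, nor the class and method names, come from this specification.

```python
class RequestBuffer:
    """Holds up to `capacity` request elements for one output link."""

    IDLE = "IDLE"  # placeholder for the "Idle" RE

    def __init__(self, capacity=24):
        self.capacity = capacity
        self._entries = []   # (priority, arrival sequence number, payload)
        self._seq = 0

    def offer(self, priority, payload):
        """Store a new RE. Returns True if stored, False if discarded."""
        entry = (priority, self._seq, payload)
        self._seq += 1
        if len(self._entries) < self.capacity:
            self._entries.append(entry)
            return True
        lowest = min(self._entries, key=lambda e: e[0])
        if priority > lowest[0]:
            # New RE outranks the lowest-priority stored RE: replace it.
            self._entries.remove(lowest)
            self._entries.append(entry)
            return True
        return False  # equal to or lower than everything stored

    def take(self):
        """Forward the highest priority RE, or the Idle RE if empty."""
        if not self._entries:
            return self.IDLE
        best = max(self._entries, key=lambda e: (e[0], -e[1]))
        self._entries.remove(best)
        return best[2]
```

A buffer with a capacity of two shows all three cases: store in a free buffer, discard an RE of equal priority, and replace the lowest priority RE.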

The detailed description of this module is provided in Section 7. 

3.1.6 Data Stream Mapper 

This module is responsible for inserting data into the outgoing serial data links. The feature 
highlights of this module are: 

• Mapping of the output link slots. This means based on the output slot number determine if the 
traffic is TDM, PDU, Request Element, or test traffic. The determination is based on the contents 
of the Mapper RAM. 

• For TDM traffic, determine the row buffer memory address. This information is stored in a 
Mapper RAM which is configured by software as TDM connections are added or torn down. 

• For PDU traffic, read one slot at a time from the row buffer. The row buffer memory address is 
stored in the Mapper RAM by software. If the target PDU is not valid (i.e., a PDU was not written 
to that row buffer location during the previous row time), then transmit the idle pattern. This will 
ensure that a data PDU is not duplicated within the switch. 

• For Request Elements, assemble the 3-slot block of REs from two 48-bit REs. The REs are read 
from the Request Arbitration module. 

• For test patterns, insert the appropriate test pattern from the Output Link Bus. These test 
patterns are created by either the Test Pattern Generator or Test Interface Bus modules. 
• Support slot multicasting at the output stage. For example, if we're the Data Stream Mapper for 
output link 3, we will be able to copy whatever any other output link is sending out on the 
current slot time. This copying is controlled via the Mapper RAM and will allow the Mapper to 
copy the output data from another output link on a slot-by-slot basis. 

The detailed description of this module is provided in Section 4.3.1.5. 

3.1.7 Data Stream Serializer 

The feature highlights of this module are: 

• Create the output slot stream. Data slots are received via the Data Stream Mapper module; 
overhead slot data is generated internally to this module. 

• Split the row data stream into two byte streams for transmission on two Unilink drivers. 

• Scramble the output byte stream 

• Serialize the output byte stream 

The detailed description of this module is provided in Section 11. 

3.1.8 Grant Stream Deserializer 

The Grant Stream Deserializer works in much the same manner as the Data Stream 
Deserializer. The primary difference is that the grant data only utilizes a single Unilink receiver, thus 
eliminating the need for deskewing and deinterleaving to recover a single input serial stream. 

Since this serial link will only be one half the data stream rate, there will only be 850 slots per 
row time. 

A single FIFO is used to allow for deskewing of the input serial grant streams for all 12 links. 

The detailed description of this module is provided in Section 11. 

3.1.9 Grant Stream Demapper 

This module is responsible for extracting the data from the incoming serial grant links. The 
feature highlights of this module are: 

• Demapping of the received grant link slots. This means based on the input slot number 
determine if the traffic is a Grant Element or another kind of traffic. The determination is based 
on the contents of the Grant Demapper RAM. Note: Traffic other than Grant Elements is TBD. 

• For Grant Elements, assemble the 3-slot block of GEs into two 48-bit GEs and forward them to 
the Grant Parser module. 

The detailed description of this module is provided in Section 7.2.3.1. 

3.1.10 Grant Arbitration 

The grant arbitration operates in an identical manner to the Request Arbitration logic. In fact, 
this module is identical to the Request Arbitration module; the only difference is that it's 
processing grant elements in the reverse path instead of request elements in the forward path. 

3.1.11 Grant Stream Mapper 

This module is responsible for inserting data into the outgoing serial grant links. The feature 
highlights of this module are: 

• Mapping of the output grant slots. This means based on the output slot number determine if the 
traffic is a Grant Element or test traffic. The determination is based on the contents of the Grant 
Mapper RAM. 

• For Grant Elements, assemble the 3-slot block of GEs from two 48-bit GEs. The GEs are read 
from the Grant Arbitration module. 

• For test patterns, insert the appropriate test pattern from the Output Link Bus. These test 
patterns are created by either the Test Pattern Generator or Test Interface Bus modules. 

The detailed description of this module is provided in Section 7.2.3.2. 

3.1.12 Grant Stream Serializer 
The Grant Stream Serializer works in much the same manner as the Data Stream Serializer. 
The primary difference is that the grant data only utilizes a single Unilink transmitter, thus 
eliminating the need for interleaving the transmit serial stream across multiple output serial streams. 

Since this serial link will only be one half the data stream rate, there will only be 850 slots per 
row time. 

The detailed description of this module is provided in Section 11. 

3.2 Datapath & Link BW Arbitration (per chip modules) 

This group of modules is instantiated only once in the iTSE. These modules provide support 
functions as part of the implementations of the core data path switching functions. 

3.2.1 Link Synchronization & Timing Control 

This module provides the global synchronization and timing signals used in the iTSE. Some of 
its features are: 

• Generate transmission control signals so that all serial outputs start sending row data 
synchronized to the RSYNC (row synchronization) input reference. 

• Control the deskewing FIFOs in the Data Stream Deserializers so that all 12 input links will drive 
the data for slot N at the same time onto the input link bus. Note: this same deskewing 
mechanism is implemented on the Grant Stream Deserializers. 

The detailed description of this module is provided in Section 10. 

3.2.2 Request Parser 

This module will receive inputs from all 13 request element sources and forward the REs to 
the Request Arbitration modules via two request element buses. Basically, this module is mapping the 
13 parallel RE inputs onto two TDM buses. 

The detailed description of this module is provided in Section 7.2.1.1. 
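
The mapping of 13 parallel RE inputs onto two buses can be sketched as a simple time-division schedule. This Python fragment is only an illustration: the round-robin ordering of the sources and the function name are assumptions, since the actual bus schedule is not given in this section.

```python
def schedule_requests(sources, num_buses=2):
    """Assign RE sources to bus cycles, up to num_buses sources per core clock cycle."""
    return [sources[i:i + num_buses] for i in range(0, len(sources), num_buses)]


# 12 input data streams plus the internal Multicast & In-Band control source.
SOURCES = list(range(12)) + ["MC/IB"]
CYCLES = schedule_requests(SOURCES)
```

With one new RE per bus per core clock cycle, the 13 sources fit in 7 cycles, which is consistent with processing all request sources in fewer than 8 core clock cycles.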

3.2.3 Grant Parser 

The grant arbitration operates in an identical manner to the request arbitration logic. In fact, 
this module is identical to the Request Parser module; the only difference is that it's processing 
grant elements in the reverse path instead of request elements in the forward path. 

3.2.4 Link RISC Processor 

This module will be a Tensilica processor core (one of two on the iTSE) which will implement 
these functions: 

• Control the ranging synchronization on the input links with the source iTPP. This function only 
needs to be done on an iTSE which resides within the first stage of the switch fabric. 

• Likewise, control the ranging synchronization on the output link grant stream input with the 
source iTPP (i.e., the iTPP generating the grant stream). This function only needs to be done on an 
iTSE which resides within the last stage of the switch fabric. 

• Multicast controller. This Link RISC Processor will handle the Req/Grant processing needed to 
transmit multicast messages. 

• In-band data communications controller. This module will only control the reception and 
transmission of the in-band communications PDUs. All PDUs will be forwarded to the 
Configuration RISC Processor which will interpret the messages. This Link RISC Processor will 
only handle the Req/Grant processing needed to transmit messages. 

The detailed description of this module is provided in Section 6. 

3.3 Support Modules 

This group of modules is instantiated only once in the iTSE. Since the primary function of the 
iTSE is switching of traffic between the input and output links, we can view modules which are not 
actively transporting traffic as "support" modules for the switching functions. 

3.3.1 Configuration RISC Processor 
This is a Tensilica based RISC processor core; it is one of two Tensilica RISC processors which 
are present in the iTSE. The primary function of this core is to process configuration and status 
messages from an external (to the iTSE) controller module. 

The detailed description of this module is provided in . 

3.3.2 System Control 

This module will handle all the reset inputs and reset the appropriate internal modules. 

The detailed description of this module is provided in Section 12. 

3.3.3 Test Pattern Generator & Analyzer 

This module will be used for the generation of various test patterns which can be sent out on 
any slot on the Data Stream or Grant Stream outputs. It will also be capable of monitoring input slots 
from either the received Data Stream or Grant Stream. 

The detailed description of this module is provided in Section 17.1. 

3.3.4 Test Interface Bus Multiplexer 

This module will allow for sourcing transmit data from the external I/O pins. Also, received 
data can be forwarded to the I/O pins. This will be used for testing the iTSE when an iTPP may not yet 
be available. 

The detailed description of this module is provided in Section 17.2. 



3.3.5 PLLs 

The Unilink PLL is used to create the IF clock needed by the Unilink macros. Within each 
Unilink macro another PLL will multiply the IF clock up to the serial clock rate. 

The Core PLL is used to create the clock used by the iTSE core logic. This core clock is 
expected to be around 250 MHz. 

The detailed description of these PLLs is provided in Section 9. 

3.3.6 JTAG 

The JTAG interface is used for two purposes: (1) boundary scan testing of the iTSE at the 
ASIC fab and (2) debug interface for the Configuration RISC Processor. Note: the Link RISC Processor 
will not have a debug interface. It will be implementing finite state machines, so we want to keep it as 
small as possible. 

The detailed description of this module is provided in Section 17.4. 

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3.4 Internal Datapath Buses 

RESERVED 
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4 Data Path Description 

This section will describe how the switch data path is implemented within an iTSE. There are two types of data 
which may be switched through the iTSE, TDM data and Data PDUs (which can carry an ATM cell or a fragment of an IP 
packet). This section will focus on TDM and unicast Data PDU switching. Multicast Data PDU switching is described in 
Section 5. 

The switching mechanism will operate on a row by row basis. This means TDM or Data traffic on an input link 
during any given row time may be switched to any output link and/or slot time within the same given row time. The 
structure of the iTSE is a Time-Space-Time division fabric. This concept is illustrated in Figure 4-1. 

Figure 4-1: iTSE Time-Space-Time Switching Fabric 

The T-S-T switching concept only applies to TDM traffic. For data PDUs, the implementation of the iTSE allows 
the data PDUs to reside in any group number within the row. This means that for data traffic the iTSE switch fabric can 
simply be viewed as a Space switch. 

Since the iTSE switches traffic on a row by row basis, a 2-Row Buffer store is required. While one row is being 
received and written into one of the 2 row buffers, the other row buffer (which contains data received during the previous 
row time) is being played out to the output link. 

TDM traffic is switched according to the configuration of the Input Link Demapper RAM, of which there is one per 
input link, and the Output Link Mapper RAM, of which there is one per output link. Refer to Figure 4-16 and Figure 
4-19 for where these RAMs are implemented. These RAMs are configured via the internal RISC processor, which in turn 
gets the configuration messages from an external software module which determines the switching path for each new 
TDM stream added to the switch fabric. 

Data traffic is switched as 16-slot "PDUs" through the switching fabric. Each PDU will include a self-route tag 
which identifies the path it will take through the switching fabric. Data PDUs enter the switching fabric based on 
scheduling algorithms which are running on each input Port Processor chip. Since these scheduling algorithms are 
independent, there is no synchronization between the iTPPs. This could result in the iTPPs sending more data PDUs to 
a single output row buffer than it can store, which would result in the switch dropping the excess PDUs. In order to prevent 
this situation from occurring, the concept of "arbitration" for the PDU data path has been introduced to the iTSE 
architecture. With this arbitration scheme, the source iTPPs will send request messages to the destination iTPPs. The 
destination iTPPs will in turn reply to the source iTPPs with a grant message, sent via a separate out-of-band control 
channel which is implemented as a full reverse overlay network within the switch fabric. If the source iTPP receives a 
grant for a request, it will be able to send the data PDU in the next row and be certain that the PDU will get to the 
destination iTPP and not be lost within the switch fabric due to buffer overflow. 

This arbitration scheme is described in detail in Section 7. This arbitration scheme, in concert with the 
implementation of the data path, will guarantee that no data is lost within the switch fabric. 

4.1 Constraints on Switch Configuration 

This section summarizes the constraints on iTSE usage which, if met, will guarantee that no data is lost within the 
switch fabric. 

TDM Switching Constraints - 

• TDM traffic may not be transported on the last 36 slots of the row. 

• In any given slot, no more than 2 TDM slots on any of the 12 input links may be destined for the same row buffer input 
TDM FIFO (see Figure 4-16 for where the TDM FIFOs reside). 

• In any given 16-slot period, no more than 18 total TDM slots may be destined for the same row buffer input TDM 
FIFO. 
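
The three TDM constraints above lend themselves to an offline check of a proposed configuration before it is loaded. The following is only a sketch under stated assumptions: the (input_link, slot_num, fifo_id) tuple layout and the function name are hypothetical, the row is taken to be 1700 slots long, and the "16-slot period" is assumed to be aligned to multiples of 16 counted from slot 0, which may not match how groups are actually placed on a link.

```python
from collections import Counter

ROW_SLOTS = 1700        # slots per row (input link capacity stated in this spec)
TDM_EXCLUDED_TAIL = 36  # TDM may not use the last 36 slots of the row
MAX_PER_SLOT = 2        # TDM slots per slot time into one TDM FIFO
MAX_PER_GROUP = 18      # TDM slots per 16-slot period into one TDM FIFO
GROUP_SLOTS = 16


def check_tdm_constraints(tdm_map):
    """Return a list of violation strings (empty if the map is acceptable).

    tdm_map: iterable of (input_link, slot_num, fifo_id) tuples, one per
    configured TDM slot. This layout is hypothetical, not an iTSE format.
    """
    violations = []
    per_slot = Counter()
    per_group = Counter()
    for link, slot, fifo in tdm_map:
        if slot >= ROW_SLOTS - TDM_EXCLUDED_TAIL:
            violations.append(f"tail: link {link} uses slot {slot}")
        per_slot[(slot, fifo)] += 1
        # Assumption: 16-slot periods are aligned to multiples of 16.
        per_group[(slot // GROUP_SLOTS, fifo)] += 1
    for (slot, fifo), count in sorted(per_slot.items()):
        if count > MAX_PER_SLOT:
            violations.append(f"slot: {count} TDM slots to FIFO {fifo} in slot {slot}")
    for (group, fifo), count in sorted(per_group.items()):
        if count > MAX_PER_GROUP:
            violations.append(f"group: {count} TDM slots to FIFO {fifo} in group {group}")
    return violations
```

A configuration tool could run such a check each time a new TDM stream is added and reject the change if the returned list is not empty.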



4.2 iTSE Data Path Timing 

Figure 4-2 illustrates the basic timing for writing incoming data into the Row Buffers. Writing of incoming data to 
the Row Buffers is the critical timing path because it's possible that as many as 14 traffic sources will simultaneously send 
traffic destined to the same output Row Buffer. The 14 traffic sources are the 12 Input Link Demappers, the Multicast 
Controller, and the Control Message Controller. 

The timing concept will operate on a "group" basis, where a group will define the 16-slots which may contain a 
data PDU. The input links associated with a specific switch stage must all have the same configuration for data PDU slots. 
This means that the data PDU slots must all be defined to line up on the same slot boundaries. This definition is done by 
way of the Input Link Demapper RAM configuration. 

For example, let's say input links 0, 1, and 2 are all configured to be within the same switch stage. In addition, 
let's say that link 0 contains 3 PDUs, link 1 contains 1 PDU, and link 2 contains 2 PDUs. One possible configuration for 
where the PDUs are placed on each of these 3 links would be: 

• Link 0 - 1st PDU in slots 2 through 17, 2nd PDU in slots 20 through 35, 3rd PDU in slots 36 through 51. 

• Link 1 - 1st PDU in slots 20 through 35. 

• Link 2 - 1st PDU in slots 2 through 17, 2nd PDU in slots 36 through 51. 

Observe that the 16-slot PDUs for all input links must always fall within the same link slot boundaries. Links which 
carry fewer PDUs may use the unused PDU slot areas for carrying TDM traffic. 
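
The alignment rule in this example can be checked mechanically. The sketch below assumes a hypothetical description of each link as a list of PDU start slots. It treats the links as aligned when every pair of PDU windows, taken across all links, is either identical or non-overlapping.

```python
PDU_SLOTS = 16  # a data PDU occupies 16 consecutive slots


def pdu_windows_aligned(pdu_starts_by_link):
    """Check that PDU windows on all links share the same slot boundaries.

    pdu_starts_by_link: dict mapping a link number to a list of PDU start
    slots (a hypothetical input format chosen for this illustration).
    """
    # Merge the start slots of every link; identical windows collapse.
    starts = sorted({s for link in pdu_starts_by_link.values() for s in link})
    # Distinct windows must not overlap, so starts are at least 16 apart.
    return all(b - a >= PDU_SLOTS for a, b in zip(starts, starts[1:]))
```

With the example above, `pdu_windows_aligned({0: [2, 20, 36], 1: [20], 2: [2, 36]})` holds, whereas moving the single PDU on link 1 to start at slot 10 would overlap the window at slots 2 through 17 and fail the check.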

The 16-slot group time period is then divided into two half-group periods, which are called "1st 1/2 group period" 
and "2nd 1/2 group period". During group number N, if an incoming link is carrying a data PDU, the 1st half of the data 
PDU will be assembled into one large 256-bit word (8 slots * 32 bits/slot). Halfway into group number N, the Row Buffer 
Mappers will be given a start signal which tells them that they may now write the first half of the PDUs into the upper part 
of the row buffers. This row-buffer write period may take up to 12 cclk cycles. While the 1st half of the group is being 
written to the row buffers, the 2nd half of the group is being assembled into another large 256-bit word (8 slots). At the 
end of group number N, the 2nd half of the PDU will be written into the lower part of the row buffers. This timing is 
illustrated in Figure 4-2. 

If an incoming link is carrying TDM data, the TDM slot data will be written directly into the appropriate TDM 
FIFO. The contents of the TDM FIFOs are then written to the row buffers during periods where PDU data is not being 
written. 

The internal core clock for this datapath logic will be running at 2x the slot rate, i.e., there will be 2 clock cycles 
available to process each received slot of data. Thus, on average, 7 slot times (14 cclk cycles) are used to write up to 14 
PDUs of data to the Row Buffer RAM. The remaining 9 slot times (18 clock cycles) will be used to unload the TDM FIFOs 
and write their contents to the Row Buffer RAM. 
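
The cycle budget in the preceding paragraph can be reproduced with a few lines of arithmetic. This is a sketch only. It assumes, as the text implies, that each of the 14 PDU sources costs one cclk cycle per group on average. The function and key names are illustrative.

```python
def group_cycle_budget(group_slots=16, cclk_per_slot=2, pdu_sources=14):
    """Split the cclk cycles of one 16-slot group between PDU and TDM writes."""
    total = group_slots * cclk_per_slot
    pdu = pdu_sources      # assumption: one cclk per PDU source per group
    tdm = total - pdu      # whatever is left unloads the TDM FIFOs
    return {
        "total_cclk": total,
        "pdu_cclk": pdu,
        "tdm_cclk": tdm,
        "tdm_slot_times": tdm // cclk_per_slot,
    }
```

With the default values this gives 32 cclk cycles per group, 14 for PDU writes and 18 (9 slot times) for TDM FIFO unloading, which is where the limit of 18 TDM slots per 16-slot period comes from.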

This timing is illustrated in the figure below. Because there are only 2 core clock cycles available per incoming 
slot time, only 2 TDM slots of data may be written to any single TDM FIFO in that single slot time. This is a 
constraint that the user must not violate when setting up TDM paths through a switch fabric. Another 
constraint for TDM traffic is that in any given 16-slot group period, a maximum of 18 TDM slots may be written to any single 
TDM FIFO. 

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For 1st 1/2 of Group: 

If TDM write to TDM FIFOs. 

If PDU assemble 1st 8 slots. 

For 2nd 1/2 of Group: 

If TDM write to TDM FIFOs. 

If PDU assemble last 8 slots. 



For Wr 1st 1/2 Group: 

Transfer TDM FIFO contents to lower 1/2 of Row Buffer (data input 287:0). 
Write 1st half of PDUs to upper 1/2 of Row Buffer (data input 575:288). 

For Wr 2nd 1/2 Group: 

Transfer TDM FIFO contents to upper 1/2 of Row Buffer (data input 575:288). 
Write 2nd half of PDUs to lower 1/2 of Row Buffer (data input 287:0). 

When writing to the row buffers, you're writing the TDM or PDU data which was 
received during the previous 8 slots. 



Note: Because we're illustrating timing for writing to the row buffers here, RE slots are not present. 

Figure 4-2: Input TDM & PDU Dataflow Timing 

Note: In this figure, we don't show the Request Element slots. Normally these RE slots would be inserted in slots 
early in the row, as shown in Section 2.2.2. The purpose of this figure is to illustrate the worst-case timing scenario for 
writing PDU and TDM data to the row buffers. 
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4.3 Datapath Core Implementation 

A block diagram of the data path core of the iTSE is shown below. 



[Block diagram. Recoverable labels: Input Link #0 and Input Link #1, each feeding an Input Link Demapper with tdm data, 
pdu data, and request outputs; Egress MC PDU and Egress Ctrl Messages inputs; MC & Ctrl Msg Ingress PDUs to Multicast 
Controller buffer; for each output link (Output Link #0 through #11), a Row Buffer Mapper, a Row Buffer RAM (store 2 rows), 
and an Output Link Mapper; Output Link Mapper inputs for Idle Pattern, Test Patterns, Request Output, and Link OH; 
connections from and to the other 11 output links for TDM multicast.] 



Figure 4-3: Switch Datapath Block Diagram 



This Datapath core will be implemented with the hierarchy shown in Figure 4-4. 
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Figure 4-4: Datapath Module Hierarchy 
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4.3.1 Datapath Link Module 

A Datapath Link module is comprised of the modules which will make up a single switch datapath link. The iTSE 
Datapath module will instantiate 12 of these Datapath Link modules. 

As shown in Figure 4-5, a Datapath Link module will instantiate these modules: 

• Input Link Demapper 

• Row Buffer Mapper 

• Row Buffer RAM 

• Output Link Mapper 

• Datapath Link CSR 
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Figure 4-5: Datapath Link Module Interfaces 



4.3.1.1 Interface I/O Signals & Timing 

Because there are a large number of I/O ports associated with a Datapath Link module, we'll summarize them in 
the table below so that the reader has a feel for the I/O signals prior to describing the implementation of each of the 
modules which make up a Datapath Link. Figure 4-5 illustrates the interfaces of a Datapath Link module. 



Ingress Traffic Timing - 

Figure 4-6 illustrates the timing for incoming data slots. The iTSE implementation will require cclk to be at least 
twice the incoming slot rate. This will ensure that the iTSE will have a minimum of two cclk cycles to process each 
incoming slot of data. The two cclk cycles per slot are split into two phases in order to identify on which cclk edge the 
incoming islot_num and islot_data signals are changing. 

The iTSE will allow cclk to be asynchronous to the serial link clocks; the only requirement is that cclk be chosen 
such that there are always a minimum of two cclk cycles per slot. Because cclk may be asynchronous to the incoming link, 
there may be more than 2 cclk cycles for any given input slot. In this case islot_phase0 and islot_phase1 will both be 
deasserted for all but the first two clock cycles of each input data slot. This case is shown during islot_num = 2 in Figure 
4-6. Whenever an extra timing adjust clock cycle is inserted, the signal islot_phasez will be asserted. 



Notes: 

• The islot_row_end would normally occur during the last slot of the link. But we do have the option of speeding up the 
link (for characterization in the lab), which would result in the link having more than 1700 slots per row. In this 
scenario, islot_row_end will be asserted starting at slot 1699 and remain asserted until slot 0 of the next row. 
• If link synchronization is lost, islot_phase0 and islot_phase1 will remain deasserted and islot_phasez will be asserted. 

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Figure 4-6: Ingress Traffic Timing 



Input Bus Timing - 

For the input bus we'll show these timing diagrams: 

• Driving received TDM slots onto the Input Bus. 

• Driving received PDUs onto the Input Bus. 

• Driving received REs onto the Input Bus. 
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Figure 4-7: Input Link Bus, TDM Timing 
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Figure 4-8: Input Link Bus, PDU Timing 
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demapper.ram, re1 , and re2 are signals internal to the Input Link Demapper module. 
ilink_req_start is generated by the Datapath Control module. 



Figure 4-9: Input Link Bus, Request Element Timing 
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4.3.1.2 Input Link Demapper 

This module is instantiated once per Link module. A high-level block diagram is shown in Figure 4-10. 

Figure 4-10: Input Link Demapper Block 

The inputs to this module are the CPU Bus and the Ingress Traffic interface. These I/O interfaces are described in 
Section 4.3.1.1. 



4.3.1.2.1 Demapper RAM 

The Demapper RAM determines how each slot on the input link is being used. The RAM will be configured by 
the CPU to do two primary functions: 

• Define the structure of the input link. This means identifying whether the input slot is carrying TDM traffic, a portion 
of a data PDU, a portion of a Request Element, a Test Traffic slot, or is a slot which should be ignored. 

• For TDM traffic, it also specifies the destination row buffer and the address within that row buffer. This is the Time- 
Space cross-connect mapping function. 

The islot_num input is incremented once for each incoming slot and will be used as the address for this RAM. 
This input slot number will start at 0 for the first slot and then simply be incremented once for each new input slot up to the 
maximum number of slots in the row. 

Since there will always be at least 2 cclk cycles per input slot, the accessing of the RAM is split into two phases. 
During phase 0 the RAM will be addressed using the islot_num input. The output of the RAM will be registered at the end 
of phase 0 so that it will remain valid for the following 2 cclk cycles. During phase 1, the RAM may be written to or read 
from by the CPU. This timing is illustrated in Figure 4-11. The RAM will be implemented with a 1 write, 1 read port 
memory cell. 

Figure 4-11: Demapper RAM Timing 



The Demapper RAM is 16-bits wide. The first bit identifies whether or not the slot contains TDM traffic. If so the 
remaining 15-bits identify the destination row buffer and address within that row buffer. If the slot is not TDM traffic, then 
the remaining 15-bits are used to identify what the slot could be used for. This structure of the RAM is shown in the figures 
below. 





TDM Slot: 

  Bit 15    Bits 14:11    Bits 10:0 
  1         DestLink      DestAddr 

DestLink: 4-bit field which specifies the destination link for this TDM slot. 
DestAddr: 11-bit field which specifies the address in the destination row buffer this TDM slot will map to. 

Non-TDM Slot: 

  Bit 15    Bits 14:12    Bits 11:0 
  0         TrafficId     contents depend on TrafficId 

  TrafficId    Description 
  000          Expansion (bits 11:0 identify traffic) 
  001          Request Element 
  010          PDU 
  011          Test Analyzer 1 
  100          Test Analyzer 2 
  101          Test Analyzer 3 
  110          Capture Register 
  111          Multicast Request Element 

Figure 4-12: Demapper RAM Structure, TDM vs Non-TDM Slot 

Request Element Slot: 

  Bit 15    Bits 14:12    Bits 11:2                   Bits 1:0 
  0         001           these bits ignored by HW    RegSel 

  RegSel    Description 
  00        RE #1 bits 53:18 
  01        RE #1 bits 17:0, RE #2 bits 53:36 
  10        RE #2 bits 35:0 
  11        reserved 

Figure 4-13: Demapper RAM Structure, Request Element Slot 
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PDU Slot 


  Bit 15    Bits 14:12    Bits 11:4                   Bits 3:0 
  0         010           these bits ignored by HW    PduSlotNum 

  PduSlotNum    Description 
  0000          PDU bits [31:0] 
  0001          PDU bits [63:32] 
  ...           ... 
  1111          PDU bits [511:480] 



Figure 4-14: Demapper RAM Structure, PDU Slot 
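
To make the Demapper RAM entry formats of Figures 4-12 through 4-14 concrete, here is a minimal decoding sketch. The bit positions are assumptions inferred from the field widths given above (TDM flag in bit 15, DestLink in bits 14:11, DestAddr in bits 10:0, TrafficId in bits 14:12, RegSel and PduSlotNum in the least significant bits). The function name and the returned dictionary keys are hypothetical.

```python
TRAFFIC_IDS = {
    0b000: "Expansion",
    0b001: "Request Element",
    0b010: "PDU",
    0b011: "Test Analyzer 1",
    0b100: "Test Analyzer 2",
    0b101: "Test Analyzer 3",
    0b110: "Capture Register",
    0b111: "Multicast Request Element",
}


def decode_demapper_entry(word):
    """Decode one 16-bit Demapper RAM entry into a dict of named fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("Demapper RAM entries are 16 bits wide")
    if word >> 15:  # bit 15 set: TDM slot
        return {
            "kind": "TDM",
            "dest_link": (word >> 11) & 0xF,  # assumed bits 14:11
            "dest_addr": word & 0x7FF,        # assumed bits 10:0
        }
    traffic_id = (word >> 12) & 0x7           # assumed bits 14:12
    entry = {"kind": TRAFFIC_IDS[traffic_id], "payload": word & 0xFFF}
    if traffic_id == 0b001:
        entry["reg_sel"] = word & 0x3         # 2 LSBs select the RE buffer
    elif traffic_id == 0b010:
        entry["pdu_slot_num"] = word & 0xF    # position within the 16-slot PDU
    return entry
```

For instance, an entry with bit 15 set, DestLink 5, and DestAddr 0x123 decodes to a TDM slot bound for row buffer 5 at address 0x123.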
Table 4-1: Expansion Slot Definitions 

  Value     Name          Description 
  0         idle          Unused ingress slot, will be ignored by Demapper logic. 
  1         BIP-36        Bit Interleaved Parity for all slots from slot 0 or previous BIP-36 slot. 
  2         LOH Status    Link Overhead slot which contains status information from the remote connection, bits 28:24. 
  3-4095                  reserved for future use 



4.3.1.2.2 TDM Assembler 

The TDM Assembler module simply latches both the TDM slot data (islot_data) and the output of the Demapper 
RAM when the incoming slot is a TDM slot. It then generates signals (ilink_tdm_dv_flag[47:0]) which inform the Row 
Buffering logic that TDM data is available. 

The timing diagram below illustrates the receiving of 4 TDM slots on the input link. Slot numbers N, N+1, N+2, 
and N+4 are the TDM slots. 

Also, non-TDM slots which don't fall into the Request Element or PDU category will be driven out on this 
TDM bus. The TrafficId bits are decoded and driven out on the ilink_tslot_flag[3:1] ports. 

4.3.1.2.3 Request Element Assembler 

Because the request element (RE) size is 54 bits, it cannot fit into a single slot. Therefore, in order to minimize the 
number of slots required to carry up to 96 REs per row we'll define a structure which will pack 2 REs into 3 slots. The 
request element assembly block will contain 2 54-bit buffers which will hold the two REs from the 3 slot structure. 

As shown in Figure 4-13, the 2 LSBs of the Demapper RAM will specify which buffer register the current 36-bit 
slot will be written to. In the timing example of Figure 4-9, the 3 slots which contain the request payloads are slot numbers 
N, N+1, and N+2. The ilink_req_start output will inform the request arbitration logic that it can now start processing the 
new request element which is available on the ilink_req bus. 
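
The packing of two 54-bit REs into three 36-bit slots can be sketched as follows. The bit assignments follow the RegSel table of Figure 4-13 (RE #1 bits 53:18, then RE #1 bits 17:0 with RE #2 bits 53:36, then RE #2 bits 35:0). The function names are illustrative only and do not correspond to iTSE signals.

```python
RE_BITS = 54
SLOT_BITS = 36
LOW18 = (1 << 18) - 1
LOW36 = (1 << 36) - 1


def pack_res(re1, re2):
    """Pack two 54-bit request elements into three 36-bit slot payloads."""
    if not (0 <= re1 < 1 << RE_BITS and 0 <= re2 < 1 << RE_BITS):
        raise ValueError("request elements must fit in 54 bits")
    slot0 = re1 >> 18                           # RE #1 bits 53:18
    slot1 = ((re1 & LOW18) << 18) | (re2 >> 36)  # RE #1 17:0, RE #2 53:36
    slot2 = re2 & LOW36                          # RE #2 bits 35:0
    return slot0, slot1, slot2


def unpack_res(slot0, slot1, slot2):
    """Reassemble the two request elements, as the RE assembler would."""
    re1 = (slot0 << 18) | (slot1 >> 18)
    re2 = ((slot1 & LOW18) << 36) | slot2
    return re1, re2


def re_slots_needed(num_res):
    """Link slots needed to carry num_res REs at 2 REs per 3 slots."""
    return -(-num_res // 2) * 3
```

With this packing, 96 REs per row occupy 144 link slots, which matches the figure used when sizing the row buffers.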



4.3.1.2.4 PDU Assembler 

Data PDU slots will be forwarded to the PDU assembler where they will be assembled into 256-bit wide words 
(8 slots * 32 bits/slot). Each group of 8 slots is one of the two halves of a 16-slot PDU. After each half-PDU is assembled, 
the 256-bit half-PDU is available to the row buffer memories so that the entire half-PDU can be written into the row buffer 
RAM in a single write cycle. 

The PDU assembler must be able to store up to 14 slots of data, 8 slots for the half-PDU, plus another 6 slots to 
store the first 6 slots of the next incoming half-PDU while the main half-PDU store buffer is waiting for the row buffer 
RAM to write the half-PDU to the memory. 

The appropriate ilink_pdu_flag is asserted for the output link this PDU should be forwarded to. For each PDU 
received on the input link, the flag to set is determined by the appropriate 4-bit field from the self-route tag in the PDU 
header. The stage_num control input determines which 4-bits to use from the self-route tag. This ilink_pdu_flag field is one 
bit per output link (as opposed to using a 4 bit encoded field) so that the Row Buffer Mapper doesn't need to decode a 4-bit 
field. 
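
The selection of the per-link flag can be illustrated with a short sketch. The layout of the self-route tag is not given in this section, so the code assumes, purely for illustration, that the 4-bit field for stage k occupies bits [4k+3:4k] of the tag. The function name is hypothetical.

```python
NUM_LINKS = 12  # output links per iTSE


def pdu_flag_for_stage(self_route_tag, stage_num):
    """Return a one-hot flag vector with one bit per output link.

    Assumption: the 4-bit field for stage k sits at bits [4k+3:4k] of the
    self-route tag. The real tag layout is defined elsewhere in the spec.
    """
    dest_link = (self_route_tag >> (4 * stage_num)) & 0xF
    if dest_link >= NUM_LINKS:
        raise ValueError(f"field value {dest_link} is not a valid output link")
    return 1 << dest_link
```

Because the result is already one-hot, a consumer such as the Row Buffer Mapper for link n only has to test bit n, which is the motivation the text gives for not using an encoded 4-bit field.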



The mapping of input PDU slots to the 512-bit complete PDU is shown below: 





  Bit labels:  511    495    479    ...    31    15    0 
  Fields:      PDU.0    PDU.1    PDU.2    ...    PDU.14    PDU.15 

The PDU slots are received on the input link in the order of PDU.0 to PDU.15. 







Figure 4-15: PDU Format 



Parity Check - 
RESERVED 



4.3.1.2.5 Capture Register 

There will be a single 32-bit register which can be used to capture any slot of data from the incoming link. The 
CPU will be able to read this register at any time. Whenever the capture register is written, the new value will be compared 
against the previous value and if any bit is different, a LinkCaptureReg IRQ will be generated. 

It is expected that this will normally be used to read the Link Overhead Mailbox slot. 

In addition to capturing the 32-bit LOH mailbox slot, this module will also have a 4-bit register to capture the 
LOH status slot. The only bits captured from this status slot are bits 28:24. As is done with the capture register, if any of the 4 
bits is different from the previous values, a LinkSyncMsg IRQ will be generated. It is expected that this LOH status slot will 
contain status information from the remote device connected to this incoming link. 

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4.3.1.3 Row Buffer Mapper 

This module controls the writing of data from the input links to the row buffer. There are two types of data which 
are written to the row buffer, TDM slots and DATA PDUs. The TDM slot data must be written to the row buffer on a slot by 
slot basis, i.e., no dependencies between TDM slots are assumed. The DATA PDU is 16 slots wide and, in order to keep up 
with the incoming data rate, must be capable of being written to the row buffer in a single clock cycle. 

A high-level block diagram is shown in Figure 4-16. 



Design Considerations - 
RESERVED 

[Block diagram. Recoverable labels: inputs ilink_tdmdata_0 through ilink_tdmdata_11 feed each of four TDM FIFOs (32x45); 
inputs ilink_pdu_0 through ilink_pdu_13 carry slots[7:0]; each TDM FIFO and the PDU inputs drive the RAM addr and write 
port of one of four Row Buffer RAMs (96x144, x2 for double buffer), each with a Write Port and a Read Port, holding 
slots 0-3, slots 4-7, slots 8-11, and slots 12-15 respectively. 
ilink_tdmdata fields: 36 bits = slot data, 7 bits = RAM addr, 2 bits = write port. 
ilink_pdu field: 256 bits = 1/2 group = 8 32-bit slots.] 





Figure 4-16: Row Buffer Mapper (data path only) 
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4.3.1.4 Row Buffer 

The double-buffered row buffers are used to store the traffic for the output link on a row-by-row basis. This means 
traffic received on one of the 12 input links during row N will be stored in a row buffer and will then be transmitted on the 
output link during row N+1. 

The row buffers are "double-buffered" in order to support the simultaneous reception and storage of traffic being 
received during the current row period and the transmission of traffic which was received during the previous row period. 
At the row boundary (as indicated by the row_toggle input signal) the row buffers will swap. 
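
The swap at the row boundary is the usual ping-pong buffering arrangement. The following is a behavioral sketch only, not a description of the RAM implementation. The class and method names are hypothetical, with row_toggle() standing in for the row_toggle input signal.

```python
class DoubleRowBuffer:
    """Behavioral model of a double-buffered (ping-pong) row buffer."""

    def __init__(self, depth):
        self._banks = [[None] * depth, [None] * depth]
        self._write_bank = 0  # bank receiving the current row

    def write(self, addr, data):
        """Store traffic received during the current row."""
        self._banks[self._write_bank][addr] = data

    def read(self, addr):
        """Fetch traffic that was received during the previous row."""
        return self._banks[self._write_bank ^ 1][addr]

    def row_toggle(self):
        """Swap the roles of the two banks at the row boundary."""
        self._write_bank ^= 1
```

Data written during row N becomes readable only after the next call to row_toggle(), which mirrors the statement that traffic received during row N is transmitted during row N+1.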

The row buffers are sized to be 96 PDUs deep. Since each PDU is 16 slots wide, the total storage capacity for a 
single row buffer is 1536 slots. The observant reader will realize this slot capacity is not large enough to hold an entire row 
of data from the input link. The input link has a capacity of 1700 slots, but we're sizing the row buffer to only hold 1536 
slots. The reasoning for this is: 

• 20 slots of the input link are the Link Overhead, which doesn't need to be stored in the row buffer. 

• If the link is configured to carry 96 PDUs of data traffic, then 144 input link slots will be required to carry the request 
elements. Request elements are not stored in the row buffers, they are forwarded directly to the arbitration logic by the 
Input Link Demapper module. 
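
The sizing argument above can be verified with simple arithmetic. The sketch below uses only numbers stated in this specification (1700 slots per row, 20 Link Overhead slots, 96 PDUs of 16 slots, and 2 REs packed into 3 slots). It assumes one RE per PDU, and the function name is illustrative.

```python
def row_buffer_slot_budget(link_slots=1700, loh_slots=20, max_pdus=96,
                           pdu_slots=16):
    """Return (row buffer capacity in slots, payload slots left on the link)."""
    re_slots = max_pdus // 2 * 3      # assumption: one RE per PDU, 2 REs per 3 slots
    capacity = max_pdus * pdu_slots   # 96 PDUs * 16 slots
    payload = link_slots - loh_slots - re_slots
    return capacity, payload
```

Both values come out to 1536 slots, so a row buffer of 96 PDUs is exactly large enough once the overhead and request element slots are excluded.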

The block diagram of the row buffer is shown in Figure 4-18. The row buffer must be capable of writing an entire 
PDU (16 slots) of data in a single clock cycle, therefore multiple memory elements must be used in order to get this data 
bus width. 

The address organization of the 96x576 row buffer is shown below: 
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Figure 4-17: Row Buffer Memory Addressing 
PDUs will always be stored on an address boundary which is a multiple of 16. 



4.3.1.4.1 PDU Fill Status Logic 

The function of this module is to monitor the reads from the row buffers by the Output Link Mapper module and 
indicate whether or not the address location being read contains valid PDU data. A location contains valid PDU data 
if it was written to during the previous row time. 

When the Output Mapper logic reads a PDU location which does not contain a valid PDU, it will transmit the idle 
pattern in place of the PDU slots. 
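
One way to picture the fill status logic is as a pair of valid-bit vectors that swap together with the row buffers. This is a behavioral sketch under that assumption. The actual implementation is not described here, and the class and method names are hypothetical.

```python
class PduFillStatus:
    """Tracks which PDU locations were written during the previous row."""

    def __init__(self, num_pdus=96):
        self._written = [[False] * num_pdus, [False] * num_pdus]
        self._write_bank = 0

    def mark_written(self, pdu_index):
        """Record a PDU write into the row currently being received."""
        self._written[self._write_bank][pdu_index] = True

    def row_toggle(self):
        """Swap banks and clear the flags for the row about to be received."""
        self._write_bank ^= 1
        bank = self._written[self._write_bank]
        self._written[self._write_bank] = [False] * len(bank)

    def is_valid(self, pdu_index):
        """True if the location being read was written in the previous row."""
        return self._written[self._write_bank ^ 1][pdu_index]
```

An output mapper model would call is_valid() before each PDU read and substitute the idle pattern whenever it returns False.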
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' rbuf_dout[35:0] 



This block is instantiated 12 times, 
once for each output link. 



Figure 4-18: Row Buffer Block Diagram 
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4.3.1.5 Output Link Mapper 

The Output Link Mapper module is responsible for mapping the transmit payloads onto the outgoing link slots. 
The sources of internal data which may be mapped onto an output link slot are: 

• TDM or PDU data from the row buffer memory. 

• Request Elements from the arbitration module. 

• A copy of the outgoing slot data from one of the other 11 links. This is the way multicasting of TDM data will be 
accomplished. 

• Idle patterns for unused slots. 

• Test traffic from the I/O pins or internal test traffic generators. 

This module is instantiated 12 times, once for each output link. A block diagram is shown in the figure below. 

The module basically consists of two main functional blocks: (1) a Mapper RAM which identifies what data 
should be placed into each output link slot and (2) logic which multiplexes the various internal data sources onto the output 
link. 



[Block diagram. Recoverable input labels: cpu_bus, oslot_num[10:0], olink_req[47:0], test_src1[35:0].] 
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(to other 1 1 link mappers) 
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Figure 4-19: Output Link Mapper 



Implementation Notes - 



Loop back the omap_data output to the omap_data input so that all 12 links are present on the omap_data input. This will 
make programming for TDM multicast in the Mapper RAM easier (there will be a 12:1 mux instead of an 11:1 mux). 

Egress Traffic Timing - 

Figure 4-20 illustrates the timing for outgoing data slots. The iTSE implementation will require cclk to be at least 
twice the outgoing slot rate. This will ensure that the iTSE will have a minimum of two cclk cycles to process each outgoing 
slot of data. The two cclk cycles per slot are split into two phases in order to identify on which cclk edge the oslot_num and 
oslot_data signals are changing. 

The iTSE will allow cclk to be asynchronous to the serial link clocks; the only requirement is that cclk be chosen 
such that there are always a minimum of two cclk cycles per slot. Because cclk may be asynchronous to the outgoing link, 
there may be more than 2 cclk cycles for any given output slot. In this case oslot_phase0 and oslot_phase1 will both be 
deasserted for all but the first two clock cycles of each output data slot. This case is shown during oslot_num = 2 in Figure 
4-20. Whenever an extra timing adjust clock cycle is inserted, the signal oslot_phasez will be asserted. 



[Timing diagram. Signals shown: oslot_phase0, oslot_phasez, oslot_num[10:0] (1699, 0, 1, 2, ...), row_toggle, 
oslot_data[35:0].] 



Figure 4-20: Egress Traffic Timing 



4.3.1.5.1 Mapper RAM 



RESERVED. 



4.3.1.5.2 Output Link Request Element 

When Mapper RAM indicates it is time to transmit a request element, the highest priority 52-bit request element 
is fetched from the arbitration module. Since only 36-bits of a RE may be transmitted in a single link slot, it will be 
necessary to add buffering within the Output Link Mapper module which will buffer the extra bits which cannot be sent in 
the current slot. 

A 3-slot structure will be defined which will be used to transmit 2 52-bit REs and their associated 2-bit BIP2 
parity. The contents of the 3 slot structure is shown in the Req Element Slot RAM structure in Figure 4-21. 

The timing for fetching REs is shown below. In this example, the 3 slot RE structure is programmed to be 
transmitted on output link slots N, N+1, and N+2. 

The Output Link Mapper module will be responsible for prepending the 2-bit BIP2 parity in bit positions 53 and 
52 to create a 54-bit request element. 

[Timing diagram. Signals shown: cclk, oslot_num[10:0] (N, N+1, N+2), mapper_ram[15:0], olink_re_rd, olink_req[47:0], 
oslot_data[35:0].] 

oslot_data contents: 
RE.0 = re1[53:18] 
RE.1 = {re1[17:0], re2[53:36]} 
RE.2 = re2[35:0] 



Figure 4-21 : Request Element Timing 

As we can see from this timing diagram, the RE (olink_req) must be available during the same cclk cycle as 
olink_re_rd is asserted. This means the Request Arbiter module will need to implement a "pre-fetch" mechanism for the 
outgoing request elements. 
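
The step of prepending the BIP2 parity to a 52-bit RE can be sketched as below. The exact BIP2 definition is not given in this section, so the code assumes the common bit-interleaved form: one even-parity bit over the even-numbered bit positions and one over the odd-numbered positions. Which parity bit lands in bit 53 and which in bit 52 is likewise an assumption, and the function names are hypothetical.

```python
RE_PAYLOAD_BITS = 52


def bip2(value, nbits=RE_PAYLOAD_BITS):
    """Return 2-bit interleaved parity: bit 1 covers odd bits, bit 0 even bits."""
    even = odd = 0
    for i in range(nbits):
        bit = (value >> i) & 1
        if i % 2:
            odd ^= bit
        else:
            even ^= bit
    return (odd << 1) | even


def add_bip2(re52):
    """Build a 54-bit RE by placing the BIP2 bits in positions 53 and 52."""
    if not 0 <= re52 < 1 << RE_PAYLOAD_BITS:
        raise ValueError("request element payload must fit in 52 bits")
    # Assumption: odd-bit parity goes to bit 53, even-bit parity to bit 52.
    return (bip2(re52) << RE_PAYLOAD_BITS) | re52
```

A receiver could recompute bip2() over the low 52 bits and compare it with bits 53:52 to detect corrupted request elements.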

4.3.1.5.3 Output Link Overhead 

There will normally be 20 slots used for Link Overhead (LOH). The mapper module will be responsible for 
inserting contents of these 20 LOH slots into the link data stream. There are 4 types of data which may be inserted into the 
LOH slots: 

LOH Framing Pattern - 

This will be a 36-bit value which is common to all output links. It will be configurable via a software 
programmable register. This pattern will be used in only 1 of the 20 LOH slots. 

LOH Status - 

This 32-bit status field will contain only a single bit of status information. Bit 24 will carry the synchronization status of 
the Grant channel for this link. All other bits will be fixed at 0. 

Note: the 4 tag bits are fixed to all 1's. 

LOHIdentifi er- 

This 32-bit will contain an identifier for this switch & link. The field is made up as: 

• loh_id[3:0] = link number that the output mapper is instantiated as. 

• loh_id[27:4] = iTSE ID number which is S W configurable (via switchjd register in the RISC core). 

• loh_id[3 1 :28] = stage number the iTSE is programmed as. 
Note: the 4 tag bits are fixed to all 1 *s. 
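
As a rough illustration of the LOH Identifier layout above, the following sketch assembles and parses the 36-bit slot. The function names are hypothetical, and placing the 4 tag bits in slot bits 35:32 is an assumption; the text only states that the tag bits are fixed to all 1's.

```python
TAG_ALL_ONES = 0xF  # the 4 tag bits, fixed to all 1's


def make_loh_id_slot(link, itse_id, stage):
    """Build the LOH Identifier slot from its three documented sub-fields."""
    if not 0 <= link < 16:
        raise ValueError("link number must fit in loh_id[3:0]")
    if not 0 <= itse_id < (1 << 24):
        raise ValueError("iTSE ID must fit in loh_id[27:4]")
    if not 0 <= stage < 16:
        raise ValueError("stage number must fit in loh_id[31:28]")
    loh_id = (stage << 28) | (itse_id << 4) | link
    # Assumption: the tag bits sit above the 32-bit field, in bits 35:32.
    return (TAG_ALL_ONES << 32) | loh_id


def parse_loh_id_slot(slot):
    """Split a received LOH Identifier slot back into its sub-fields."""
    loh_id = slot & 0xFFFFFFFF
    return {
        "link": loh_id & 0xF,
        "itse_id": (loh_id >> 4) & 0xFFFFFF,
        "stage": (loh_id >> 28) & 0xF,
        "tag": slot >> 32,
    }
```

For example, link 11 of an iTSE with ID 0xABCDEF programmed as stage 3 gives loh_id = 0x3ABCDEFB.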

LOH Stuff -

This 32-bit pattern will be inserted in the LOH slots which aren't used for framing, status, or ID. This pattern will
be configurable via a software programmable register and is common to all output links.

Note: the 4 tag bits are fixed to all 1's.

LOH Mailbox -

This 32-bit mailbox will be configurable via a software programmable register. There will be a unique mailbox
register for each output link. This mailbox register will provide a mechanism to allow the CPU in this iTSE to communicate
with the CPUs attached to its output links. At the time this is being written, there are no known applications for this
mailbox.

Note: the 4 tag bits are fixed to all 1's.

4.3.1.6 Datapath Link CSR

This section summarizes the Control Status Registers (CSRs) used to configure and monitor the operation of an
individual Datapath Link module. CSRs include all configurable memory devices within this module; these devices may be
individual flip-flops, register arrays or memory arrays.

Note: there is an additional CSR module which contains global control information which is common to all
Datapath Link modules. This global CSR module is described in Section 4.3.2. Statistics information which is gathered by
each individual Datapath Link Module is stored in a centralized Statistics module which is accessed via the global CSR
address space.

Unless otherwise noted, the reset value for programmable fields is 0.

Table 4-2: Datapath Link Module CSRs

Each row is one 32-bit word (bits 31:0). Bits not occupied by a listed field read as all 0's.

Address Offset       Contents
0x0000 - 0x1A8C      DemapperRam, location 0 through location 1699
                     unused address space
0x1F80               ReceivedPDUCount (don't clear on read)
0x1F84               TransmittedPDUCount (don't clear on read)
0x1F88               ErroredPDUCount (don't clear on read)
0x1F8C               ErroredReqCount (don't clear on read)
0x1F90               BIP36ErrorCount (don't clear on read)
0x1F94               PeakBIP36Errors (don't clear on read)
                     unused address space
0x1FA0               ReceivedPDUCount (clear on read)
0x1FA4               TransmittedPDUCount (clear on read)
0x1FA8               ErroredPDUCount (clear on read)
0x1FAC               ErroredReqCount (clear on read)
0x1FB0               BIP36ErrorCount (clear on read)
0x1FB4               PeakBIP36Errors (clear on read)
                     unused address space
0x1FD0               ErrorFlags
0x1FD4               ErrorFlagsMask
0x1FD8               RBufPduLimit, ILinkDemapperControl, LinkControl, OLinkMapperControl
0x1FDC               RxLohSync
0x1FE0               TxLohMailbox
0x1FE4               RxCaptureReg
                     unused address space
0x2000 - 0x3A8C      MapperRam, location 0 through location 1699

Notes on Stats Counters -

Each statistic counter has 2 addresses from which it may be read. One address will automatically clear the counter
after the read cycle; the other address will not clear the counter. The counters will saturate at all 1's if the max count value
is reached.
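
The counter behavior described above (saturation at all 1's, plus a clearing and a non-clearing read address) can be modeled with a small sketch. The class and method names are hypothetical and only illustrate the described semantics.

```python
class StatCounter:
    """Behavioral model of one saturating statistics counter."""

    def __init__(self, width):
        self.max_value = (1 << width) - 1  # all 1's for the given width
        self.value = 0

    def count(self, events=1):
        # Saturate at all 1's instead of wrapping around.
        self.value = min(self.value + events, self.max_value)

    def read(self, clear):
        # clear=True models the clear-on-read address,
        # clear=False models the don't-clear-on-read address.
        value = self.value
        if clear:
            self.value = 0
        return value
```

For a 12-bit counter such as ErroredPDUCount, counting 5000 events leaves the counter at 4095 (0xFFF) until it is read through the clear-on-read address.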



ReceivedPDUCount

Cumulative count of the incoming valid PDUs received on this link and forwarded to the Row Buffers. This
counter is 20 bits wide.

TransmittedPDUCount

Cumulative count of the outgoing valid (non-idle) PDUs transmitted on this link. This counter is 20 bits wide.

ErroredPDUCount

Cumulative count of the incoming PDUs discarded due to a checksum parity error. This counter is 12 bits wide.

ErroredReqCount

Cumulative count of the incoming REs discarded due to a BIP2 parity error. This counter is 12 bits wide.

BIP36ErrorCount

Cumulative count of the BIP36 errors detected each row time. This counter is 24 bits wide.

PeakBIP36Errors

Maximum BIP36 errors detected in a single row time. This counter is 12 bits wide.



Notes on Error Flags - 

Error flags are classified into one of two types: (1) Configuration errors, which are caused by a
mis-configuration of the hardware, and (2) Traffic errors, which are generated based on the incoming traffic stream. The
"Type" column in the error flag description table below indicates which type of error it is.

ErrorFlags 

This 24-bit register contains several flags for error events which may be detected within the Datapath Link
Module. Error flags are latched upon detection of the error event and remain latched until they are cleared by
software. An error flag is cleared by writing a "1" to that bit position.

Each entry gives the bit number, the flag name, the error type and the reporting module, followed by the description.

Bit 0 - re_seq_error (Type: Config, Module: Input Link Demapper)
Error in RE sequence in Demapper RAM. A sequence which is not RE 0, 1, then 2 has been detected.

Bit 1 - re_dist_error (Type: Config, Module: Input Link Demapper)
Error in RE distribution in the Demapper RAM. This occurs if REs are spaced too close together. Each group of
3 REs needs to be spaced at least 16 cclk cycles apart.

Bit 2 - re_parity_error (Type: Traffic, Module: Input Link Demapper)
BIP2 parity error detected in a RE. A cumulative count of errored REs is maintained in the statistics module.

Bit 3 - pdu_seq_error (Type: Config, Module: Input Link Demapper)
Error in PDU sequence in Demapper RAM. A sequence which is not PDU 0, 1, through 15 has been detected.

Bit 4 - pdu_parity_error (Type: Traffic, Module: Input Link Demapper)
Parity error detected in a PDU. A cumulative count of errored PDUs is maintained in the statistics module.

Bit 5 - slot_parity_error (Type: Traffic, Module: Input Link Demapper)
BIP36 slot parity error detected. A cumulative count of the BIP36 errors is maintained in the statistics module.

Bit 6 - (Type: Traffic, Module: Input Link Demapper)
Unused, always read as 0.

Bit 7 - (Type: Traffic, Module: Input Link Demapper)
Unused, always read as 0.

Bit 8 - tdm_flag_error_0 (Type: Config, Module: Row Buffer Mapper)
More than cclks_per_slot incoming TDM slots are valid for this FIFO in a single slot period. The TDM slots which
exceed cclks_per_slot are lost. The cclks_per_slot parameter is configured in the Global CSRs.
This TDM FIFO services Row Buffer Addresses 0 through 4 (mod 16).

Bit 9 - tdm_flag_error_1 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 5 through 7 (mod 16).

Bit 10 - tdm_flag_error_2 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 8 through 11 (mod 16).

Bit 11 - tdm_flag_error_3 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 12 through 15 (mod 16).

Bit 12 - tdm_fifo_full_error_0 (Type: Config, Module: Row Buffer Mapper)
TDM FIFO overflow. This TDM FIFO services Row Buffer Addresses 0 through 4 (mod 16).

Bit 13 - tdm_fifo_full_error_1 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 5 through 7 (mod 16).

Bit 14 - tdm_fifo_full_error_2 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 8 through 11 (mod 16).

Bit 15 - tdm_fifo_full_error_3 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 12 through 15 (mod 16).

Bit 16 - tdm_fifo_ne_error_0 (Type: Config, Module: Row Buffer Mapper)
The TDM FIFO is not empty at the end of the row. This TDM FIFO services Row Buffer Addresses 0 through 4 (mod 16).

Bit 17 - tdm_fifo_ne_error_1 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 5 through 7 (mod 16).

Bit 18 - tdm_fifo_ne_error_2 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 8 through 11 (mod 16).

Bit 19 - tdm_fifo_ne_error_3 (Type: Config, Module: Row Buffer Mapper)
This TDM FIFO services Row Buffer Addresses 12 through 15 (mod 16).

Bit 20 - pdu_limit_error (Type: Config, Module: Row Buffer Mapper)
Set if input PDUs are discarded because pdu_limit has been met. Normally this would be caused by:
1. The request arbiter is configured for more PDUs than pdu_limit.
2. The source device is sending more PDUs than it has been granted.
3. Mis-routed PDUs which are not detected by the PDU parity mechanism.

Bit 21 - pdu_start_error (Type: Config, Module: Row Buffer Mapper)
Set if a PDU start indicator is received for the next half of a PDU while the current half PDU is still being
serviced. For example, if input link 0 Demapper RAM is configured for a PDU on slots 0 thru 15 and input link 1
is configured for a PDU on slots 4 thru 19, and both receive PDUs destined for the same output link, this error
will be generated.

Bit 22 - (Type: Config, Module: Row Buffer Mapper)
Unused, always read as 0.

Bit 23 - (Type: Config, Module: Row Buffer Mapper)
Unused, always read as 0.


ErrorFlagsMask (reset state = 0xFFFFFF)

The ErrorFlags are passed through this mask and then logically OR'ed together to generate the
dp_link_config_error and dp_link_traffic_error output signals. The mask bit must be a "1" to enable an error flag
to be used in asserting the error output signals.

Even if an error flag is masked off, its status can still be read via the ErrorFlags register.
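
The latch, write-1-to-clear, and mask behavior of ErrorFlags and ErrorFlagsMask can be sketched as below. This is a behavioral model only: the class and method names are hypothetical, and the split of bits into Config and Traffic groups is taken from the Type column of the error flag table (unused bits are grouped by the type the table lists for them).

```python
# Bit groups taken from the "Type" column of the ErrorFlags table.
CONFIG_BITS = sum(1 << b for b in (0, 1, 3, *range(8, 24)))
TRAFFIC_BITS = sum(1 << b for b in (2, 4, 5, 6, 7))


class ErrorFlags:
    """Behavioral model of the ErrorFlags / ErrorFlagsMask pair."""

    def __init__(self, mask=0xFFFFFF):
        self.flags = 0      # latched error flags (24 bits)
        self.mask = mask    # reset state: all flags enabled

    def latch(self, bit):
        # Hardware sets the flag when the error event is detected.
        self.flags |= 1 << bit

    def write(self, value):
        # Software clears a flag by writing a "1" to that bit position.
        self.flags &= ~value & 0xFFFFFF

    def outputs(self):
        # Masked flags are OR'ed into the two error output signals.
        enabled = self.flags & self.mask
        config_error = bool(enabled & CONFIG_BITS)
        traffic_error = bool(enabled & TRAFFIC_BITS)
        return config_error, traffic_error
```

Note that masking a flag only removes it from the output signals; the latched value in `flags` is unaffected, matching the statement that a masked flag can still be read.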

RBufPduLimit

This 7-bit field specifies the maximum number of PDUs which may be stored in the row buffer in any
single row period.

ILinkDemapperControl 

This 8-bit register controls the operation of the Input Link Demapper module.

Bit 0 - slot_parity_sel
BIP36 parity select, 0 = even parity, 1 = odd parity.

Bit 1 - re_parity_sel
Request Element BIP2 parity select, 0 = even parity, 1 = odd parity.

Bit 2 - disable_re_par_check
If set, errored REs are not discarded. Error flag and statistic counters will still be incremented if parity
errors are detected.

Bit 3 - disable_pdu_par_check
If set, errored PDUs are not discarded. Error flag and statistic counters will still be incremented if parity
errors are detected.

Bit 4 - disable_rcv_tdm
If set, all TDM traffic received on this input link is ignored, regardless of the state of the Demapper RAM.

Bit 5 - disable_rcv_pdu
If set, all PDU traffic received on this input link is ignored, regardless of the state of the Demapper RAM.

Bit 6 - disable_rcv_re
If set, all RE traffic received on this input link is ignored, regardless of the state of the Demapper RAM.

Bit 7 -

OLinkMapperControl

This 8-bit register controls the operation of the Output Link Mapper module.

Bit 0 - slot_parity_sel
BIP36 parity select, 0 = even parity, 1 = odd parity.

Bit 1 - re_parity_sel
Request Element BIP2 parity select, 0 = even parity, 1 = odd parity.

Bit 2 -

Bit 3 -

Bit 4 - disable_xmt_test
If set, the Idle Pattern will be sent on all Test slots.

Bit 5 - disable_xmt_tdm
If set, the Idle Pattern will be sent on all output TDM slots.

Bit 6 - disable_xmt_pdu
If set, the Idle Pattern will be sent on all output PDU slots.

Bit 7 - disable_xmt_re
If set, the Idle Pattern will be sent on all output RE slots.



LinkControl 



This 8-bit register supplies miscellaneous control bits for other modules within the Datapath Link. 



Bit 0 -

Bit 1 -

Bit 2 -

Bit 3 -

Bit 4 -

Bit 5 -

Bit 6 -

Bit 7 -


TxLohMailbox 

32-bit field which may be inserted into the transmit link overhead slots. The Output Link Mapper RAM must be 
configured to insert this field at the appropriate slot time. 

RxCaptureReg (read only) 

32-bit capture register. Any slot on the incoming link may be captured and read by software via this address. The
Input Link Demapper RAM must be programmed in the appropriate slot time to capture the slot data. Only one
slot should be captured in any row time since the capture register will be overwritten for each slot which is
programmed via the Demapper RAM for the capture register.

RxLohSync (read only) 

Bits 27:24 of the received Link Overhead Status slot may be read from this location. This location is updated for
each slot which the Input Link Demapper RAM defines as a LOH Status slot.



Bit 0 - gnt_sync
Bit 24 of LOH status. This bit indicates the synchronization status of the Grant channel input of the device at
the other end of this link.

Bit 1 - sp_sync
Bit 25 of LOH status. This bit indicates the synchronization status of the Service Process device at the other
end of this link. If the other end of this link is another iTSE, then this bit will always be 0.

Bit 2 -
Bit 26 of LOH status. Reserved for future use.

Bit 3 -
Bit 27 of LOH status. Reserved for future use.

The rx_loh_sync_mask in the DpGlobalControl register defines which RxLohSync bits are monitored for
generating the RxMsgIrq. Even if a bit is masked off, its status can still be read via this RxLohSync register.
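
As a small illustration of the RxLohSync bit assignments, the sketch below extracts the four sync bits from a received LOH Status word and names the two defined bits. The function names are hypothetical; the bit positions follow the table above.

```python
def rx_loh_sync_from_status(loh_status):
    """Return the 4-bit RxLohSync value taken from LOH status bits 24 through 27."""
    return (loh_status >> 24) & 0xF


def decode_rx_loh_sync(rx_loh_sync):
    """Name the two defined RxLohSync bits; bits 2 and 3 are reserved."""
    return {
        "gnt_sync": bool(rx_loh_sync & 0b01),  # LOH status bit 24
        "sp_sync": bool(rx_loh_sync & 0b10),   # LOH status bit 25
    }
```

A status word with only bit 24 set therefore decodes to gnt_sync set and sp_sync clear, which is what the text describes when the far end of the link is another iTSE with a synchronized Grant channel.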



4.3.2 Datapath Control

This module implements functions which are either common to or shared by all 12 Datapath Link modules. These 
functions include: 

• CPU bus address space decoding to support the CSRs for each link and the global CSR. 

• Global Control/Status registers. 

• Timing control memory which will allow for simultaneous support for two different link configurations (i.e., how each 
of the input link row slots are utilized). This will allow a single iTSE to be used in up to 2 stages of the switch fabric 
(folded network). 

• Logically OR the error_flag from each of the 12 Datapath Link modules and output a single datapath_error flag output
signal.

• Statistics gathering of PDU traffic received in each row buffer. 

4.3.2.1 Datapath Memory Map 

The memory map for the entire Datapath module is as follows (the address offset is the address offset of that 
module from the base address of the Datapath module): 

• Datapath Link #0 CSR - Address offset: 0x00000, size: 16Kbytes.

• Datapath Link #1 CSR - Address offset: 0x04000, size: 16Kbytes.

• Datapath Link #2 CSR - Address offset: 0x08000, size: 16Kbytes.

• Datapath Link #3 CSR - Address offset: 0x0C000, size: 16Kbytes.

• Datapath Link #4 CSR - Address offset: 0x10000, size: 16Kbytes.

• Datapath Link #5 CSR - Address offset: 0x14000, size: 16Kbytes.

• Datapath Link #6 CSR - Address offset: 0x18000, size: 16Kbytes.

• Datapath Link #7 CSR - Address offset: 0x1C000, size: 16Kbytes.

• Datapath Link #8 CSR - Address offset: 0x20000, size: 16Kbytes.

• Datapath Link #9 CSR - Address offset: 0x24000, size: 16Kbytes.

• Datapath Link #10 CSR - Address offset: 0x28000, size: 16Kbytes.

• Datapath Link #11 CSR - Address offset: 0x2C000, size: 16Kbytes.

• Datapath Global CSR - Address offset: 0x30000, size: 4Kbytes.
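
Because the link CSR blocks are laid out at a regular 16 Kbyte stride, software can compute a register's offset rather than tabulate it. The helper names below are hypothetical, and the offsets are relative to the Datapath module base address, as in the list above.

```python
LINK_CSR_STRIDE = 0x4000     # 16 Kbytes per Datapath Link CSR block
GLOBAL_CSR_OFFSET = 0x30000  # Datapath Global CSR block
GLOBAL_CSR_SIZE = 0x1000     # 4 Kbytes
NUM_LINKS = 12


def link_csr_offset(link, reg_offset=0):
    """Offset of a register inside the CSR block of Datapath Link #link."""
    if not 0 <= link < NUM_LINKS:
        raise ValueError("link must be 0 through 11")
    if not 0 <= reg_offset < LINK_CSR_STRIDE:
        raise ValueError("register offset outside the 16 Kbyte link block")
    return link * LINK_CSR_STRIDE + reg_offset


def global_csr_offset(reg_offset=0):
    """Offset of a register inside the Datapath Global CSR block."""
    if not 0 <= reg_offset < GLOBAL_CSR_SIZE:
        raise ValueError("register offset outside the 4 Kbyte global block")
    return GLOBAL_CSR_OFFSET + reg_offset
```

For instance, the ErrorFlags register (link offset 0x1FD0 in Table 4-2) of Datapath Link #7 would sit at 0x1C000 + 0x1FD0 = 0x1DFD0 from the Datapath base.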

4.3.2.2 Datapath Global CSR 

This section summarizes the Control Status Registers (CSRs) used to configure and monitor the operation of all
Datapath Link modules. These CSRs are for control which is common or shared by all Datapath Link modules.

Note: there is an additional CSR module which contains link control information which is instantiated in each 
Datapath Link module. This link CSR module is described in Section 4.3.1.6. 

Unlike the CSR module inside the Datapath Link, the data bus interface to this module is 32-bits wide. 

Unless otherwise noted, the reset value for programmable fields is 0. 

Table 4-3: Datapath Global CSRs

Each row is one 32-bit word; fields are listed from the most significant bits (31) down to the least significant (0).

Address Offset   Contents
0x0FB0           unused, IdlePtrn[35:32]
0x0FB4           IdlePtrn[31:0]
0x0FB8           DpGlobalControl
0x0FBC           LohFptrn[31:0]
0x0FC0           LohStuff[31:0]
0x0FC4           all 0's, LinkTrafficErrorFlags, all 0's, LinkConfigErrorFlags
0x0FC8           all 0's, LinkTrafficErrorFlagsMask, all 0's, LinkConfigErrorFlagsMask
0x0FCC           all 0's, LinkMailboxMsgIrq, all 0's, LinkSyncMsgIrq
0x0FD0           all 0's, LinkMailboxMsgIrqMask, all 0's, LinkSyncMsgIrqMask
0x0FD4           all 0's, ConfigErrorSlotNum

PDUcountLink0..11

32-bit count of the number of PDUs written to each link's Row Buffer. This counter is cleared on read. Note: if
address minus 0x10 is used as the read address, the count value can be read without clearing the counter.

IdlePtrn 

Idle Pattern, 36-bit pattern which is inserted into unused transmit slots. Insertion is controlled by the Output 
Link Mapper RAM. 

DpGlobalControl 



Datapath Global Control. 32-bit register which provides control settings common to all 12 Datapath Links.

Bits 3:0 - req_valid_time (reset value 0x7)
Number of cclks minus 1 that the request element must be valid when being presented to the arbitration logic.
This should never be modified.

Bits 7:4 - cclks_per_slot (reset value 0x2)
Minimum number of cclks per slot. This should never be modified.

Bits 11:8 - rx_loh_sync_mask (reset value 0xF)
This mask defines which RxLohSync bits will be monitored for generating the LinkSyncMsg Interrupt request from
each Datapath link module. The interrupt is generated whenever any unmasked bit changes state from the previous
row. A "1" will unmask the bit and enable it to be monitored for IRQ generation. This common mask value is used
by all 12 Datapath Link modules.
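
The three DpGlobalControl fields and the masked change-detection rule for the LinkSyncMsg interrupt can be sketched as follows. The names are hypothetical, and only bits 11:0 are modeled because the remaining bits of the register are not described here.

```python
# Reset value of bits 11:0, assembled from the three documented reset values.
RESET_LOW12 = (0xF << 8) | (0x2 << 4) | 0x7


def decode_dp_global_control(value):
    """Split the low 12 bits of DpGlobalControl into its documented fields."""
    return {
        "req_valid_time": value & 0xF,
        "cclks_per_slot": (value >> 4) & 0xF,
        "rx_loh_sync_mask": (value >> 8) & 0xF,
    }


def link_sync_irq(prev_sync, cur_sync, mask):
    """True if any unmasked RxLohSync bit changed state since the previous row."""
    return bool((prev_sync ^ cur_sync) & mask & 0xF)
```

With the reset mask of 0xF, any change in the four RxLohSync bits raises the request; clearing a mask bit suppresses the interrupt for that bit while leaving the bit readable.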












































LohFptrn 

Link Framing Pattern. 32-bit field which may be inserted into the transmit link overhead slots. The Output Link 
Mapper RAM must be configured to insert this field at the appropriate slot time. 

LohStuff 

Link Stuff Pattern. 32-bit field which may be inserted into the transmit link overhead slots. The Output Link 
Mapper RAM must be configured to insert this field at the appropriate slot time. 

LinkConfigErrorFlags (read only) 

The 12 config error flag outputs of each Datapath Link module may be read from this location. The error flags 
can only be cleared by writing to the ErrorFlags CSR in each Datapath Link module. 

LinkConfigErrorFlagsMask (reset state = 0xFFFFFF)

The 12 config error flags are passed through this mask and then logically OR'ed together to generate the Config
Error IRQ. The mask bit must be a "1" to enable an error flag to be used in asserting the IRQ. Even if an error
flag is masked off, its status can still be read via the LinkConfigErrorFlags register.

LinkTrafficErrorFlags (read only) 

The 12 traffic error flag outputs of each Datapath Link module may be read from this location. The error flags 
can only be cleared by writing to the ErrorFlags CSR in each Datapath Link module. 

LinkTrafficErrorFlagsMask (reset state = 0xFFFFFF)

Operates identically to LinkConfigErrorFlagsMask. 

LinkMailboxMsgIrq

The 12 LinkMailboxMsg IRQ outputs of each Datapath Link module may be read from this location. The IRQ
events are latched upon detection and remain latched until they are cleared by software. An IRQ is cleared by
writing a "1" to that bit position.

LinkMailboxMsgIrqMask (reset state = 0xFFFFFF)

Operates identically to LinkConfigErrorFlagsMask.

LinkSyncMsgIrq

The 12 LinkSyncMsg IRQ outputs of each Datapath Link module may be read from this location. The IRQ 
events are latched upon detection and remain latched until they are cleared by software. An IRQ is cleared by 
writing a "1" to that bit position. 

LinkSyncMsgIrqMask (reset state = 0xFFFFFF)

Operates identically to LinkConfigErrorFlagsMask. 

ConfigErrorSlotNum 

When a LinkConfigError is generated, the current input link slot number is latched. The latched value is
automatically cleared when it is read. Only when this latch is all 0's will the slot number be latched upon a
LinkConfigError. This means only the first LinkConfigError which occurs during a row time will be captured.
This is intended to help the wayward software engineer isolate where in the row the configuration error is.

5 Data Multicast Description

This section will describe how data multicasting and broadcasting are implemented within an
iTSE. This section covers the multicasting and broadcasting of data PDUs only; the concept of
multicasting TDM traffic is covered elsewhere in this specification. Note: in this section the terms
"PDU" and "packet" are synonymous. For the iTSE, "broadcast" can be viewed as simply another form
of multicast (with a lot of destinations). Therefore, this section will only speak of "multicast", but the
reader should realize that this will also be how broadcast is implemented.

5.1 Design Objectives 



• Support a one-to-many multicast scheme. Many-to-many multicasting will not be explicitly 
supported but may be possible depending on the complete system architecture. 

• Routing tables at the oTPP will contain the new VPI/VCI for the outgoing cell. These tables
will be indexed by the Multicast Flow Identifier.

5.2 Multicast Operation 

In order to support multicast operation in switch fabrics which could range from as small as 12x12
ports to as large as 1728x1728 ports, the iTSE will support several methods of multicast operation:

• multicast mask

• broadcast

• multicast ID

• iTPP-based multicast

The multicast controller within the iTSE implements a store and forward mechanism. This 
means that all PDUs which may be multicast from a given iTSE will be written to a buffer within the 
multicast controller (MC-CTLR), from there the iTSE will multicast the packet to various destinations. 
Note: the normal Req-Grant arbitration mechanism is used to get link bandwidth for transmitting the 
MC PDUs. 

5.2.1 Multicast Modes 

The multicast modes of operation will be briefly described here. The next section will provide 
some examples of multicast operation which may make understanding these operating modes a bit 
easier. 

Depending on the size of the switch fabric, several of these multicast operating modes may be
used in conjunction to provide an effective multicast solution.

Multicast Mask 



5.2.2 Multicast Examples 

The iTSE is designed to support a multi-stage packet duplication architecture as shown in
Figure 5-1. In this example architecture, the input iTPP will unicast the multicast packets to a
multicast controller in stage 1 (switch chip 1.0 in this figure); from there the multicast packet will be
duplicated and sent to the multicast controllers in the third stage. The third stage multicast
controllers will again duplicate the packet for each output link it is destined for.

The multicast controller implements a store and forward mechanism. This will be explained
by using the multicast example in Figure 5-1. In this scenario multicasting will operate as follows:

1. The iTPP will send the multicast packet to the multicast controller (MC-CTLR) in switch 1.0.
The MC-CTLR will have buffers in which it will store the received multicast (MC) packet.

2. The MC-CTLR will then duplicate the packet 2 times so that it will now have a total of 3 copies
of the packet. Then for each packet it will replace the routing tag and MC forwarding tags (see
the MC PDU format shown in TBD). The packets destined for switch 3.0 and 3.2 will be given
the paths to those switch chips. The packet that is being routed through switch 3.1 doesn't
need to be multicast from that switch chip, therefore it will be given the path to send it
directly to the output iTPP.

3. The MC-CTLR will then forward the 3 MC packets to the 3 destinations (switches 3.0 and 3.2,
and the iTPP off switch 3.1). This forwarding will again use the normal Req-Grant arbitration
mechanism for obtaining bandwidth.

4. At switch chips 3.0 and 3.2, the MC packets will be duplicated and sent out on output ports
which are specified in the MC copy field of the MC packet. Since this is the last multicast
stage, the MC packets do not need to be modified prior to transmission. The identical MC
packet is sent on all the appropriate output links. Of course, the normal Req-Grant arbitration
mechanism will be used to get link bandwidth from the last MC stage to the iTPP.
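
The duplication performed by the first-stage MC-CTLR in step 2 can be sketched as a simple store-and-forward copy loop. This is an illustrative model under stated assumptions, not the iTSE implementation: the class and function names are hypothetical, the PDU is reduced to the three header fields that the text says are replaced (routing tag, MC copy tag, broadcast bit) plus an opaque payload, and arbitration for link bandwidth is not modeled.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class McPdu:
    route_tag: int   # routing tag carried in the PDU header
    mc_copy: int     # MC copy tag (one bit per output link)
    bc: int          # broadcast bit
    payload: bytes   # payload is never modified by the MC-CTLR


@dataclass(frozen=True)
class CacheDest:
    route_tag: int
    mc_copy: int
    bc: int
    valid: bool      # entries with valid clear are ignored


def duplicate_from_cache(pdu: McPdu, dests: List[CacheDest]) -> List[McPdu]:
    """Make one copy of the stored PDU per valid destination entry,
    replacing the header fields with the values stored for that destination."""
    return [
        replace(pdu, route_tag=d.route_tag, mc_copy=d.mc_copy, bc=d.bc)
        for d in dests
        if d.valid
    ]
```

In the Figure 5-1 scenario the destination list would hold three valid entries (switch 3.0, switch 3.2, and the direct path through switch 3.1), yielding the 3 copies described in step 2.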



[Figure 5-1 diagram: three-stage switch fabric with columns labeled Stage 1, Stage 2, and Stage 3.]

Figure 5-1: Multicasting within the iTAP Switch Fabric

For the MC-CTLR at the first stage (switch 1.0 in Figure 5-1), we see that it will need to modify
the routing and MC copy tags within the MC packet prior to transmitting the duplicated packets. This
means that the MC-CTLR will need to have information which is specific to each MC flow which may
use the MC-CTLR as a copy forwarding stage. The following parameters will need to be stored in the
iTSE MC-CTLR prior to the arrival of the MC packet:

Per MC Flow -

• 11 bits to identify the source iTPP of the MC packet.

• 14 bits to identify the MC Flow ID as assigned in the sourcing iTPP.

• 4 bits to identify the cache entry to use (this is discussed later).






Per output destination to which the packet will be multicast -

These bits will be used to replace the fields currently in the original multicast PDU.

• 28 bits for routing tag.

• 12 bits for McCopy tag.

• 1 bit for the new broadcast field.

• 1 valid bit which defines whether this is a valid entry; if clear, then ignore this entry.

These bits will be used in the creation of the request element used for link BW arbitration.

• 5 bits for VoqID.

• 3 bits for priority.

• 3 bits for res field.

The RouteTag will be the same one used above when copying the PDU. The ReqId will be
created by the iTSE multicast controller in order to identify the returning grant.

Thus, for each destination which the multicast PDU must be copied to, we need a total of 53
bits of information.
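
The 53-bit total can be checked by adding up the per-destination field widths. In the sketch below the dictionary names are hypothetical, and VoqID is taken as 5 bits, which is the only width consistent with the stated total given the other six widths.

```python
# Fields that replace header fields in the copied PDU.
COPY_FIELD_BITS = {"route_tag": 28, "mc_copy": 12, "bc": 1, "valid": 1}

# Fields used to build the request element for link bandwidth arbitration.
REQUEST_FIELD_BITS = {"voq_id": 5, "priority": 3, "res": 3}


def bits_per_destination():
    """Total storage needed per multicast destination, in bits."""
    return sum(COPY_FIELD_BITS.values()) + sum(REQUEST_FIELD_BITS.values())
```

The copy fields contribute 42 bits and the request element fields 11 bits, giving 53 bits per destination.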



[Figure 5-2 diagram: three-stage switch fabric with columns labeled Stage 1, Stage 2, and Stage 3. The # on a
link indicates the # of copies of the same PDU. If no # is shown, then it's 1.]

Figure 5-2: Multicasting w/ Multicast ID Mode

5.2.3 Single PDU 

The figure below illustrates a typical arbitration and data passing sequence for a data PDU.

[Figure 5-3 diagram: a 3-stage fabric (iTSE1 in Stage 1, iTSE2 in Stage 2, iTSE3 in Stage 3) connected by the
Ingress Link, Stage1_2 Link, Stage2_3 Link, and Egress Link. Each link has 2 components: a Data Link and a
Grant Link. RouteTag[11:0] = 0x462 is used on ingress PDUs and REs. The row contents, from Slot 0 through
Slot 1699, are shown for Row N through Row N+7 on the Data and Grant components of each link; the elements
shown in the rows are labeled MCRE, MCGE, and PDU.]

Figure 5-3: Single MC PDU Through a 3-Stage Switch Example

5.3 Multicast PDU Structures
5.3.1 Multicast Data PDU Format 

The format for the 16-slot multicast data PDU is shown below. The shaded fields are identical
to the fields in the unicast data PDU format and are described in Section 2.2.4.1.

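Table 5-1: Multicast Data PDU Format

[Table layout: 16 slots (0 through 15), each 36 bits wide, with bit columns 35, 31:24, 23:16, 15:8, and 7:0.
The fields legible in the source are listed per slot.]

Slot      Legible fields
0         BC, MC (remaining fields are the shaded unicast fields)
1         McCopy (remaining fields not legible)
2         Parity, McCacheTag, reserved (- - -), SrcPortId, SrcMcFlowId
3 - 14    (not legible)
15        Parity, Payload Byte 48, Payload Byte 49, ...

Note: reserved bit positions are indicated with a "-". The default state for these bits is 0.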



BC - Broadcast 

This bit will be set if this is a broadcast PDU. 

MC - Multicast 

This bit will be set if this is a multicast PDU. 

McCopy 

12-bit Multicast Copy field. This field identifies which output links this multicast PDU must
be multicast to. McCopy[0] is for output link #0, McCopy[1] is for output link #1, etc. If all 12
bits are cleared and the MC field = "01", then the multicast controller will use the multicast
cache and perform packet duplication based on the contents of the cache.
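
The McCopy bit-to-link mapping can be illustrated with a short helper that lists the output links selected by a given 12-bit value. The function name is hypothetical.

```python
NUM_OUTPUT_LINKS = 12


def output_links(mc_copy):
    """Return the output link numbers whose McCopy bit is set."""
    if not 0 <= mc_copy < (1 << NUM_OUTPUT_LINKS):
        raise ValueError("McCopy is a 12-bit field")
    return [link for link in range(NUM_OUTPUT_LINKS) if (mc_copy >> link) & 1]
```

An empty result corresponds to the all-bits-cleared case, in which the text says the multicast cache decides where the PDU is copied.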

SeqNum 

SeqNum is the fragment sequence count, it will be incremented for each fragment. The 
SeqNum will start at 0 for the first fragment. If there are more than 16 fragments to the PDU, 
this SeqNum will roll over past 15 and continue counting. 
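
The rollover behavior can be written as a modulo-16 count. This sketch assumes SeqNum is a 4-bit field that wraps from 15 back to 0, which is how "roll over past 15" is read here; the function name is hypothetical.

```python
SEQNUM_MODULUS = 16  # assumption: 4-bit SeqNum, values 0 through 15


def seq_nums(fragment_count):
    """SeqNum value carried by each fragment of a PDU, in transmission order."""
    return [index % SEQNUM_MODULUS for index in range(fragment_count)]
```

Under this assumption, the 17th fragment of a long PDU carries SeqNum 0 again, so a receiver cannot rely on SeqNum alone to tell fragments 0 and 16 apart.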

McCacheTag 

4-bit tag which identifies the multicast cache entry to use when duplicating this multicast 
PDU. 

SrcPortId

This 11-bit field identifies the source port processor for this multicast PDU.

SrcMcFlowId

This 14-bit field identifies which multicast flow this PDU is associated with. This is the flow 
ID from the input Port Processor's multicast ID space. 



5.3.1.1 Background on Multicast Data PDU Fields 

This section provides a bit of background on why the fields in the
multicast data PDU defined in Table 5-1 are present and how they are used.

BC & MC bits - 

The usage of these bits will depend on who is processing the packet.

For the iTSE multicast controller the algorithm shown in Figure 5-4 is used.

[Figure 5-4 flowchart. Legible decision label: BC=0 && MC=1. Action boxes, in the order they appear:
 - Copy PDU; update McCopy, BC, and RouteTag from the cache entry; send the PDU on output links per the cache entry.
 - Discard PDU.
 - Copy unmodified PDU to all output links.
 - Copy unmodified PDU to output links per the McCopy field.
 - Discard PDU.]

Figure 5-4: iTSE MC Controller Processing of BC & MC Bits



For the output Port Processor the algorithm shown in Figure 5-5 is used. This algorithm
assumes the iTPP already knows that it is dealing with a multicast packet. The current plan is that all
multicast/broadcast PDUs will be sent to a virtual output queue which is reserved for MC/BC packets.



[Figure 5-5 flowchart. Boxes, in the order they appear:
 - Use hashing algorithm to search a table to see if this PDU is an active flow. Use SrcPortId & SrcMcFlowId
   as hashing inputs.
 - Copy Cell/Reassembled packet to all output PHYs per hash entry.
 - Discard PDU.]

Figure 5-5: iTPP Processing of BC & MC Bits
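
The lookup in Figure 5-5 can be sketched with an ordinary hash table keyed on the two flow-identifying fields. This is only an illustration: the class and function names are hypothetical, the specification does not say which hashing algorithm the iTPP uses, and a Python dictionary stands in for it here.

```python
from typing import Dict, List, Optional


def flow_key(src_port_id: int, src_mc_flow_id: int) -> int:
    """Combine the 11-bit SrcPortId and 14-bit SrcMcFlowId into one lookup key."""
    if not 0 <= src_port_id < (1 << 11):
        raise ValueError("SrcPortId is an 11-bit field")
    if not 0 <= src_mc_flow_id < (1 << 14):
        raise ValueError("SrcMcFlowId is a 14-bit field")
    return (src_port_id << 14) | src_mc_flow_id


class McFlowTable:
    """Active outgoing multicast flows of one output port processor."""

    def __init__(self) -> None:
        self._flows: Dict[int, List[int]] = {}

    def add_flow(self, src_port_id: int, src_mc_flow_id: int, phys: List[int]) -> None:
        self._flows[flow_key(src_port_id, src_mc_flow_id)] = list(phys)

    def lookup(self, src_port_id: int, src_mc_flow_id: int) -> Optional[List[int]]:
        """Return the output PHYs for an active flow, or None to discard the PDU."""
        return self._flows.get(flow_key(src_port_id, src_mc_flow_id))
```

A PDU whose (SrcPortId, SrcMcFlowId) pair is not in the table is not a member of an active flow and would be discarded, matching the last box of the flowchart.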



McCopy -

This field is only used by the iTSE multicast controller. It is used to determine which output 
links the PDU should be copied to. See the flowchart in Figure 5-4 on how this field is used. 

The port processor will always ignore this field. 



McCacheTag - 

Because we're implementing a store and forward technique within the iTSE multicast 
controller, we'll need some information (more than what can fit in the MC PDU header) which 
determines where to forward the MC packet to and the new parameters which need to be replaced for 
each forwarded copy. This information will be stored in a cache within the iTSE. The cache entries are 
loaded by having the input iTPP forward a "MC Parameter" PDU prior to each MC data PDU. Since a 
PDU is only 64 bytes the MC Parameter PDU will only be able to carry enough information to allow the 
MC data PDU to be copied to 7 unique destinations. 
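
The figure of 7 destinations follows from the slot budget of the parameter PDU, as the short calculation below shows. The constant names are hypothetical. The final value, the number of destinations reachable if every McCacheTag value held a full cache entry, is an upper bound inferred from the 4-bit tag width and is not a figure stated in this specification.

```python
SLOTS_PER_PDU = 16          # a PDU occupies 16 slots
HEADER_SLOTS = 2            # slot 0 (routing) and slot 1 (MC session parameters)
SLOTS_PER_DESTINATION = 2   # each destination entry takes 2 slots
CACHE_TAG_BITS = 4          # width of McCacheTag


def destinations_per_parameter_pdu():
    """Destinations that fit in one MC Parameter PDU."""
    return (SLOTS_PER_PDU - HEADER_SLOTS) // SLOTS_PER_DESTINATION


def max_destinations_per_flow():
    """Inferred upper bound: one full cache entry per McCacheTag value."""
    return (1 << CACHE_TAG_BITS) * destinations_per_parameter_pdu()
```

With 14 slots left after the two header slots, 7 destinations fit in one parameter PDU; 16 tag values would allow at most 112 destinations per flow under this reading.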

Since 7 destinations may not be enough for a multicast flow, a mechanism is needed to allow
for more destinations. This mechanism will be the "Cache Tag" which will allow a single MC flow to
have multiple valid cache entries.

5.3.2 Multicast Parameter PDU Format

The format for the 16-slot multicast parameter PDU is shown below. 

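Table 5-2: Multicast Parameter PDU Format

[Table layout: slots with bit columns 35, 31:24, 23:16, 15:8, and 7:0. The fields legible in the source are
listed per slot.]

Slot      Legible fields
0         Parity, 1100, RouteTag
1         Parity, McCacheTag, SrcPortId, SrcMcFlowId
2 - 3     Parity, BC, reserved (-), RouteTag, VoqId, res, Priority, McCopy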

Slots (addresses) 2 & 3 contain the cache entry parameters for 1 destination. Six more
destinations may be defined in the same manner using slots 4 thru 15.

Note: reserved bit positions are indicated with a "-". The default state for these bits is 0.

Slot 0 - 

This word is used to route the Multicast Parameter PDU from the input iTPP to the iTSE 
multicast controller. Its fields are those for a unicast packet as defined in Table 2-3.

Slot 1 -

This word identifies the MC session parameters for this cache entry. The fields will match
those from the multicast data PDU which will be following this parameter PDU. 

Slots 2 & 3 -

These two words contain the cache entry parameters for 1 destination. When the MC PDU is
duplicated the BC, RouteTag, and McCopy fields will be replaced with those in these slots. 

The VoqID, res, and Priority fields are used (along with the RouteTag) for creating the request 
element which will be used to arbitrate for link bandwidth for sending the duplicated data PDU. 
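
The slot arithmetic described in this section can be written out as a small sketch. It only restates the layout given in the text (slot 0 for routing, slot 1 for the session parameters, two slots per destination); the constant and function names are illustrative assumptions.

```python
SLOTS_PER_PDU = 16          # the multicast parameter PDU is 16 slots long
HEADER_SLOTS = 2            # slot 0 routes the PDU, slot 1 identifies the MC session
SLOTS_PER_DESTINATION = 2   # e.g. slots 2 & 3 describe the first destination

MAX_DESTINATIONS = (SLOTS_PER_PDU - HEADER_SLOTS) // SLOTS_PER_DESTINATION


def destination_slots(index: int) -> tuple:
    """Return the pair of slot addresses that hold destination number `index` (0-based)."""
    if not 0 <= index < MAX_DESTINATIONS:
        raise ValueError("a parameter PDU holds destinations 0 through 6 only")
    first = HEADER_SLOTS + SLOTS_PER_DESTINATION * index
    return first, first + 1
```

This reproduces the limit of 7 destinations per parameter PDU: destination 0 occupies slots 2 & 3 and destination 6 occupies slots 14 & 15.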
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5.4 Port Processor's Role in Multicasting 
5.4.1 Input Port Processor

5.4.2 Output Port Processor 

The output iTPP will receive multicast packets from the switch fabric. For each MC packet it
must determine (1) if the MC packet is a member of one of the potentially 16K active outgoing MC
flows which the iTPP can support and (2) which Utopia PHYs the packet must be multicast to.

In order to determine if the MC packet is a member of an active MC flow, the iTPP will need to
perform a hashing algorithm on the fields from the MC PDU which identify the MC flow. These fields
are the SourcePort 

The Port Processor board may potentially be supporting up to 192is designed with multiple 
PHY ports attached to the iTPP's Utopia bus.

5.4.3 Multicast PDU Latency 

Since the multicasting mechanism uses a store and forward approach, it will take longer to 
get multicast packets through the switch fabric. In the example of Figure 5-1, the best case time for
sending a MC packet through this 3 stage switch fabric is calculated as follows:

1. 2 row times are needed to get the MC packet from the iTPP to the
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6 Multicast Implementation 

Possible 2nd Tensilica which is used to handle the multicasting is here.







Proprietary and Confidential Information of Onex Communications Corporation



7 Link Bandwidth Arbitration 



7.1 Theory of Operations 
7.1.1 Overview 

Each port processor operates without any knowledge of what the other port processors are 
doing. As a result, when they go to send their PDUs, they need to know 2 things: 

• Does the output port processor have room in its queues for this PDU? 

• Is there bandwidth in the chosen path to get the data from one end of the switch fabric to the
other without packet loss? 

The arbitration mechanism will check both of these criteria and send back a grant to the
requesting port processor on a PDU by PDU basis. When a port processor has been given a grant it
knows for certain that the data will make it to the output port processor (barring system failure).

Each row time, each port processor will make a request for each group that it wishes to send
data on in the next row. This request will be a message which is broken into 96 request elements, one 
element for each possible data group requested. These request elements will be multiplexed in with 
the data stream (see Overview & Datapath chapters). The requests stream through the switch and are 
'knocked out' based on a priority field. Since the 12 inputs could all converge on a single output, the 
outgoing link will not be able to handle the traffic presented to it. The highest priority traffic should 
be allowed to go through the switch fabric. A small buffer pool exists in each output link to hold some 
of the requests when multiple requests come into the switch chip which are destined for the same 
output link. At the far end of the switch fabric, the port processor will make a decision to grant or 
deny a request based on its QOS queues. The port processor will then source a grant message which 
also travels through the switch fabric, but in an out-of-band overlay network which goes in the 
opposite direction of the switch fabric. The grants will be written without regard to priority into a fifo 
and read in order of arrival time. 

7.1.2 Basic Algorithm 

The arbitration mechanism will work as follows: 

1. At the start of a row time the input port processor will begin outputting its request message, made
from a stream of request elements. A request element is a request for a single group's worth of bandwidth
in the switch fabric destined for a particular port processor. The format of the request elements
is shown below.

2. The first stage in the switch fabric will look at each request from all 12 input links as well as 
the multicast and control message controller. The requests traverse the switch fabric by using 
a self routing tag which indicates the hop-by-hop output ports used at each stage of the switch
fabric. At this time, the Stage 1 hop-by-hop field will be replaced with the input port number 
that the request entered on. This parser logic will be able to handle all the requests from all 
14 request sources within a single request element time. 

3. The requests for each output link will be stored in a buffer pool. As long as buffers are free, 
requests will be stored. As soon as there are no free buffers, lower priority requests will be 
overwritten with higher priority ones. The request buffers will be able to support 12 input 
links all converging on a single output, meaning that 12 request elements can be written to 
the buffer pool every 'request' time. Requests are evicted from the buffer pool based on prior- 
ity and age. The youngest lowest priority requests will be dropped, and the highest priority 
oldest requests will be kept. 

4. After a programmable amount of time, the request buffers will be read from by the switch 
Mapper. After a request is read the request element is deleted from the buffer, making room for
another request element. The output Mapper will only read 96 request elements- it will not 
over-request an output link. Any requests still in the buffers will be dropped. 

5. This happens all the way to the end of the switch fabric and into the port processor. The port 
processor will make a decision to accept or reject the request based on the QOS field. Then, it 
will source a grant message. The grant message uses the modified self routing tag of the
request element to traverse the switch fabric backwards using an overlay network. 
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6. The grant path in the switch uses another instantiation of the parser logic and a set of buffer 
fifos which get written to and read out of based on arrival time. (The links will be scanned in
order and written into the fifos in that same order, link 0 to link 11.) In addition a mapper and
demapper will be used to determine where in the link the grants should be placed. It is the
intent that they are all adjacent to each other in the row.
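
The tag rewriting in steps 2 and 5 can be sketched as follows. This is a minimal model under stated assumptions, not the chip's logic: the self routing tag is represented as a Python list with one output link number per stage, and the function names are hypothetical.

```python
def forward_request(tag, input_links):
    """Walk a request through the fabric, one stage per entry of `input_links`.

    tag:         output link number to use at each stage (the self routing tag)
    input_links: the link on which the request enters each stage
    Returns the modified tag (each stage field replaced by the input link number)
    and the list of (stage, input link, output link) hops taken.
    """
    tag = list(tag)
    hops = []
    for stage, in_link in enumerate(input_links):
        out_link = tag[stage]      # hop-by-hop field for this stage
        tag[stage] = in_link       # overwrite it to remember the reverse path
        hops.append((stage + 1, in_link, out_link))
    return tag, hops


def return_grant(modified_tag):
    """Grant path: visit the stages in reverse, leaving each one on the link
    the request originally arrived on."""
    return [(stage + 1, modified_tag[stage])
            for stage in reversed(range(len(modified_tag)))]
```

For a 3 stage fabric with tag `[3, 7, 11]` and a request entering the stages on links 0, 5 and 2, the modified tag becomes `[0, 5, 2]`, and the grant leaves stage 3 on link 2, stage 2 on link 5 and stage 1 on link 0.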

7.1.3 Folded switch fabrics 

RESERVED 

7.1.4 Multicast Support 

In addition to the 12 input links, provisions need to be made for multicast traffic as well
as request messages made by the local processor. Multicast request elements that flow into a switch
will flow through the switch fabric the same as standard unicast request elements. At the point where
the message needs to be multicast the hop-by-hop field's bit code for that switch stage will indicate
that the request is multicast. The request will be forwarded to the multicast controller. On the grant
path, the multicast controller will simply source a grant if there is room for the data in the multicast
recirculating buffers. Once the data has been transmitted to the multicast buffer, the multicast 
controller will examine the data header and determine which output links it needs to be sent out on. 
At this point, it will source a number of request messages which will look to the request-controller as 
if the switch had 13 inputs, not 12. They will be handled the same as unicast requests from one of the 
input links. 

The arbitration algorithm has been designed with the intent that each request message will
request a single group. To aid the multicast manager, however, there is a bit in the request/grant 
element which indicates if the request is for 1 or 2 groups. 

7.1.5 Arbitration Message Element Format 

Below are the actual bit patterns of the request and grant elements. Each of these request 
elements requests a single data group on the switch link. It takes 96 of these to request an entire
iTAP row. Serialized versions of these stream between switch elements. 

The request/grant messages will take the form of a self routing message with a 3 bit priority, 
7 bit sequence number and either a 5 bit Virtual Output Queue ID or congestion indicators. The 
Request Element has 2 forms- one is for a 7 stage network, the other is for 5 or smaller stage 
networks. In the latter case the 6th and 7th stage self routing tags are replaced with a QOS field. The switch
should simply pass the bits which are set in these fields. 
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Table 7-1: Request Message Element (7 Stage) 
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Table 7-2: Grant Message Element (7 Stage) 
Stage 1...Stage 7: The Stage fields indicate the output link number that the given switch chip
should forward the message to. Valid request messages are those with output link numbers
between 0 and 11. Other values are reserved for special commands. Each switch chip will need to
know at which stage of the network it is located.
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Bit Field 
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Figure 7-3: Stage X Bit Field Codes 



• QOS: This 8 bit field refers to the queue which the data is destined for in the Output Port
Processor. The Output port processor is expected to use this number to determine if it should 
grant or reject the request. 

• OutputPortID: This is a 7 bit field used for the output port processor to know which virtual 
output queue to place the data. 

• ReqID: As grants stream back to the input port processor which made the requests, the port 
processor needs to have a way to identify which grant goes with which request. By providing an 
identification number field, the port processor has a way to quickly associate the grant with a 
specific request. It is expected that the port processor will insert 0 into the sequence number for 
request 0 and 1 for request 1, all the way up to 93 for the request 93. 

• Num: These 2 bits indicate if the request/grant is for 1, 2, 3 or 4 groups. Although it is expected
that the normal port processor data flow will be on a group by group request basis, the multicast
controller will operate more efficiently if it has the ability to easily arbitrate for multiple groups.
This is needed simply to speed up the passing of the multicast control packets. The coding of
these bits is as follows:
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• Priority: This is a 3 bit field to indicate the importance of a request. Highest priority is seven 
(0b111), lowest priority is zero (0b000). This priority is the priority of the request or grant element
through the switch fabric. 

• Res: These bits will be carried through the switch fabric by the iTAP Switch; the port processor
may do what it wants with these bits for passing additional signaling information across the
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switch fabric. 



7.1.6 Calculating slot numbers of request element arrivals 

The mapper and demapper rams on the data path will have entries in them for the request
elements. The equations below are useful for calculating which entries in the rams will be setup for 
request elements. 

These equations assume that the specified 3...8 slot timing is used, and that multiples of 16
slots will be used unbroken between programmable fill time of 4 is specified. 4 refers to the number of 
request element times to wait before outputting request elements from any switch stage. Since the 
request elements are sent in pairs, this is an even number. Due to the internal timing of the switch, 
there is always a fill time of 2- otherwise nothing would have been written into the buffers yet. 

There are several equations below. The first equation generates the starting slot number 
which request elements will come in to a given switch stage at. The second equation generates the slot 
number which request elements will start leaving a switch stage at. The third equation is used to 
calculate the slot numbers which the row demapper ram should be programmed for.

Dependent Variables: 

• StageNum - the stage number of this switch element. Values shall range from 1 to 7. 

• PFT- programmable fill time, an even number which is the number of request elements that the 
input links are allowed to source before the output link starts up. Values range from 2 (minimum)
to XX (maximum) and are even numbers. It is expected that 2, 4 and 6 are the values used by the
switch fabric. 

• PLD - Pipeline delay for output mapper. Nominally this is 18 clock cycles ~ 9 slots. 

• RET - 11, the number of slots in the request... data... request... data pattern.

• PRET - 32, the number of slots used to offset th

• PPDelay - the number of slots that the port processor waits before sending any requests. This 
number should be zero. 

Once the starting numbers have been calculated, the following equation is used to calculate
the slot numbers that the demapper ram should be programmed to for a particular element. Since 
every request element spans 2 adjacent slots in the chosen timing, the equations below output the 
first slot number that the request appears on. 



7.1.7 Request-Grant Arbitration Cycle Timing 

The arbitration cycle timing for varying sized switch fabrics and fill times has been calculated. 
The following has been assumed in all of the calculations: 

• Request Elements are mapped into the row as 3 slots carrying 2 request elements followed by 8 
slots of data. This yields 1 request element every 45ns. 

• Grant elements are mapped into a 2.2gbps serial stream as 3 slots carrying 2 grant elements. 

• The time it takes an output port processor to accept or deny a request element is 125ns. 

• The electrical delay between the port processors and switch fabric is set to 500ns. This delay 
occurs 4 times in 1 round trip. 

The Request Elements travel at a rate of 1 every 45 ns. Grant elements travel at a rate of 2 per
3 slots, with no padding between groups of 3 slots; at a data rate of 2.2gbps, this is 1 every 25 ns. The
grants can be sent faster than they will be received by the switch fabric. As a result, the last grant 
element timing will be based on the speed of the request message rather than grant speed since the 
request element is the limiting factor. 
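
The two element rates quoted above follow from the slot times given elsewhere in this chapter (8.17 ns per slot on the iTAP link, 16.36 ns per 36 bit slot on the 2.2gbps grant link). The sketch below simply repeats that arithmetic; the constant and function names are illustrative.

```python
REQUEST_SLOT_NS = 8.17   # slot time on the iTAP data link
GRANT_SLOT_NS = 16.36    # 36 bit slot time on the 2.2gbps grant link


def request_element_period_ns() -> float:
    """3 request slots + 8 data slots carry 2 request elements."""
    return (3 + 8) * REQUEST_SLOT_NS / 2


def grant_element_period_ns() -> float:
    """3 slots carry 2 grant elements, with no data slots in between."""
    return 3 * GRANT_SLOT_NS / 2
```

These evaluate to about 44.9 ns and 24.5 ns, which the text rounds to 45 ns and 25 ns.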

For each network, the following times are given: 

• First RE in: the time that the first request element gets to the output port processor 

• Last RE in: the time that the last request element in a message gets to the output port processor 

• First GE in: the time that the first grant element gets back to the input port processor 

• Last GE in: the time that the last grant element gets back to the input port processor







Arbitration cycle timing. Each row lists one value per Fill Time column (the Fill Time
column headings are not legible).

Fabric            Event          Fill Time columns
1 Stage           1st RE in      1253    1298    1343
                  Last RE in     5573    5618    5663
                  1st GE in      2570    2615    2660
                  Last GE in     6890    6935    6980

2 Stage           1st RE in      1433    1523    1613
                  Last RE in     5753    5843    5933
                  1st GE in      2865    2955    3045
                  Last GE in     7185    7275    7365

3 Stage           1st RE in      1613    1748    1883
                  Last RE in     5933    6068    6203
                  1st GE in      3159    3294    3429
                  Last GE in     7480    7614    7749

4 Stage           1st RE in      1793    1973    2423
                  Last RE in     6113    6518    6743
                  1st GE in      3454    3634    3814
                  Last GE in     7774    7954    8134

5 Stage           1st RE in      1973    2198    2423
                  Last RE in     6293    6518    6743
                  1st GE in      3748    3973    4198
                  Last GE in     8068    8293    8518

6 Stage           1st RE in      2153    2198    2423
                  Last RE in     6413    6518    6743
                  1st GE in      4043    4313    4583
                  Last GE in     8363    8633    8903

7 Stage           1st RE in      2333    2648    2963
                  Last RE in     6653    6968    7283
                  1st GE in      4337    4652    4967
                  Last GE in     8657    8972    9288



The following calculations show the travelling of the request messages through the iTAP 
Switch Fabric. The following delays are encountered when a request element is sent by a port 
processor. 

• Ted: Electrical delay. Start of Row is asserted, the entire request message has been assumed to 
be built already by the scheduler. The time to read the row and the electrical delay associated 
with getting the signal from the port processor to the switch fabric is 500ns (maximum). 

• Trm: Request Message Transit Time. The transmission time of a single request element is bound
up in the sending of 2 request elements in the 3 slot... 8 slot timing. As a result it takes 11 slots to
transmit 2 request elements. There will be 48 of these 3...8 timing groups. Therefore the width of
a request message is 48 x 11 slots = 528 slots. A slot is transmitted by the iTAP link in
8.17ns, so a request message will be 4.314us wide.

• Tgm : Grant Message Transit Time. The transmission time of a single grant element is bound up 
in the sending of 2 grant elements in 3 slots. The grants can be packed into the row as tightly as 
possible. Therefore the fastest grant message time is 96 grants * 1.5 slots per grant = 144 slots.
Over the grant link, the bandwidth is 2.2gbps, so that a 36 bit slot takes 16.36ns. Therefore the 
grant transit time is 2.355us. But since the request message is slower than the grant message, 
the size of the grant message will be the same as the request message since the request elements 
feed the grant stream. Tgm= Trm... 

• Tpft : Programmable Fill Time. It is possible to allow the request buffers to start filling up before 
any requests are sent out. Although this adds time to the round trip timing of the request 
message, it helps ensure that the highest priority requests are forwarded at each stage of the
switch fabric. This fill time would be manifested in the way that the mapper ram is programmed.
The slots where the request elements would be programmed would be set deeper into the mapper
RAM so that they occurred later in the row. Although this setting could be anything, multiples of
the 3-8-3-8 timing are used below as a practical example. The programmable fill time must be an
even integer with a minimum value of 2. The calculation from Tpft to slots is simply
11 * 0.5 * Tpft. In the examples a fill time of 4 has been chosen. This implies that (8+3) * 2 slots
occur before the request elements are output; 22 slots = 180ns.

• Tpl: The iTAP Switch chip has an internal request pipeline latency of 18 clock cycles (9 slots) for
the request elements. 8.17ns * 9 = 73ns

• Topp : Output Port Processing Time. This is given as 32 clock cycles, which @250MHz is 16 slots. 
16 slots will be the metric used (which is a full group time). This is 130ns

• Tgdl: Grant Delay. 3 Slots needed to extract and switch grant elements, 1 slot to process, and 3 
more to assemble the grant elements = 7 slots. 16.36ns * 7 = 114.52ns.
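
The individual delay terms listed above can be recomputed from the two slot times. The sketch below is only a check of that arithmetic, with illustrative names; it does not attempt to reproduce the tabulated round trip times, since the way the terms combine per stage is not spelled out in the text.

```python
SLOT_NS = 8.17          # slot time on the iTAP link
GRANT_SLOT_NS = 16.36   # 36 bit slot time on the 2.2gbps grant link


def t_rm_ns(elements: int = 96) -> float:
    """Request message width: 2 elements per 11 slot (3 + 8) timing group."""
    return (elements // 2) * 11 * SLOT_NS


def t_gm_fastest_ns(elements: int = 96) -> float:
    """Fastest grant message: 1.5 grant-link slots per grant element."""
    return elements * 1.5 * GRANT_SLOT_NS


def t_pft_ns(pft: int) -> float:
    """Programmable fill time: 11 * 0.5 * PFT slots, PFT even and at least 2."""
    if pft < 2 or pft % 2:
        raise ValueError("fill time must be an even integer >= 2")
    return 11 * 0.5 * pft * SLOT_NS


T_PL_NS = 9 * SLOT_NS           # request pipeline latency, 9 slots
T_OPP_NS = 16 * SLOT_NS         # output port processing, 16 slots
T_GDL_NS = 7 * GRANT_SLOT_NS    # grant delay, 7 grant-link slots
```

With these inputs Trm comes to about 4314 ns, the fastest Tgm to about 2356 ns, and a fill time of 4 to about 180 ns (22 slots), in line with the figures quoted in the bullets.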

Below is a picture illustrating an arbitration cycle's request and grant messages flow through 
a 7 stage switch fabric with the fill time set to 4 request elements. 
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Figure 7-4: Arbitration Cycle In- Flight Timing 



7.2 Link Bandwidth Arbitration Implementation 

7.2.1 Request Implementation

The request path is implemented in 2 major functional units. These are the request parser 
and the request arbiter. 

Below is a top level block diagram of the arbitration logic. The Request Parser examines the 
hop-by-hop self routing tag of the request elements and forwards the requests to the appropriate 
output link request logic. It needs to replace the 'current' stage number field with the input link
number which the request came in on. This keeps a record of the reverse path so that the grant can 
get back to the input port processor. The output link logic will handle 2 requests every clock cycle, 
looking at the requests and writing them to the request buffer pool. The Output Buffer Logic reads the 
buffer pool and sends out the highest priority requests. The request parser is instantiated once in the 
design for the request elements, and the output link buffer logic is instantiated once per output link (12
times total for the request elements). 
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Figure 7-5: Request Arbitration Logic 
At the top level the I/O for the request grant path will be the following 
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Table 7-6: Request Arbitration Top Level I/O 

7.2.1.1 Request Parser 

The request parser is responsible for taking the 12 input links, multicast controller and
inband messaging controller request elements and determining which output links they need to go to.
The parser needs to have the ability to process all of the request elements every 8 clock cycles (which 
is the maximum incoming request element rate). Since there are 14 inputs, 2 inputs will be processed 
every clock cycle with a spare clock cycle in case something can't make timing and needs to be 
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registered. 

The request arbitration mechanism will receive request elements from the datapath. Below is 
a timing diagram of the interface. 



[Timing diagram: signals cclk, req (request element 0, request element 1), ilink_req_start]



Figure 7-7: Request Parser Input Timing 

Upon every ilink_req_start a new request element is ready. The req-start signal from all of the
input links will be registered, and every 7 clock cycles will be tested to see if a request element is
ready. The fastest that these req-start signals can come in is every 8 clocks, so the request parser is
assured of getting to all of the input REs. This interface is duplicated 12 times on the input of the
request parser (1 for each input link).
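
The timing margin claimed here is simple to check: 14 request sources handled two per clock need 7 clocks, against a minimum spacing of 8 clocks between request elements on any one link. A short sketch, with illustrative names:

```python
import math


def parser_cycles_per_round(num_sources: int = 14, sources_per_clock: int = 2) -> int:
    """Clock cycles needed to visit every request source once."""
    return math.ceil(num_sources / sources_per_clock)


def spare_cycles(min_spacing_clocks: int = 8) -> int:
    """Cycles left over before the next request element can arrive on a link."""
    return min_spacing_clocks - parser_cycles_per_round()
```

With the numbers from the text this leaves one spare clock cycle per round.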

Below is a timing diagram showing the timing for the output of the parser: 
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Figure 7-8: Request Parser output timing 

Request Elements are valid for 1 clock cycle coming out of the parser. Along with the request 
elements there is a 12 bit vector which indicates which of the 12 output links the request element is 
destined for. In the figure above, the req(a/b) bus holds the request elements; the notations indicate
which input link they came from. The figure shows a strict encoding of the links; however, due to
design requirements of the request arbiter, 'reqa' will always have the higher priority request. So, for
timing on the first request element, link00 and link01 will always be written into the buffer pool first,
but the parser may switch which link is output on reqa: if link01's request was a higher priority it
would go out on reqa.

Requests which come into the parser may be made available on the next clock cycle or as many as
7 clock cycles later. It depends on which input link they came in on, and the current input link pair
that the parser is working on. 

The top level signal I/O for the request parser is given below: 
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7.2.1.1.1 Request Parser Design Notes 

The request parser latches the logical pulse of the 'request start' signal to indicate whether or 
not the input link has a valid request. The rate of these start signals is equal to or greater than the 
processing time needed to write the requests into the request buffers. Since the request logic can 
write the buffers into the buffer pool 2 at a time, on every clock cycle, a 3 bit free running counter will 
examine the 'request valid' signals and process the request.

1. Latch the logical 'request start' pulses which occur at a maximum rate of 1 every 8 clock cycles. This 
signal goes high for 1 clock to indicate that a new request element has been assembled. 

2. Every clock cycle look at 2 of these latched request starts. Input link 0 and 1 will be processed
first, then 2 and 3 and so on. If the request start latched signal is not set, this input link will
be considered 'idle' and not be processed.

3. Pull out the 4 bit self routing tag which is 'valid' for this input's stage number. Replace with
the input port it came in on. Do this for both elements. There shall be several programmable 
registers which indicate to the parser which stage of the switch that particular input link is a 
member of. 

4. Sort the 2 request elements by priority so that the 'a' request element output is the higher of the
2 priorities, (this is a requirement of the buffer pool). 

5. Register the request elements along with the signals which indicate which output port it is
destined for. These are 'valid signals' for the 2 request elements.

6. Clear the latched Request Start signal. 

7. Increment Counter. 

• Each input stage needs to have a 3 bit field programmed which sets the stage number that the
input link is located at. This allows the switch chip to exist in multiple stages of the switch
fabric.

• Multicast and In band control messages also input a 12 bit field which indicates which of the 12 
output links it should go out on. 

• At the beginning (or end of) every row CLEAR all of the latched signals so that the following row
time doesn't have the previous row's request elements.
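
The per-clock steps in the design notes above can be sketched in software form. This is a simplified model under stated assumptions, not the hardware design: all links share a single `stage_num` (the text gives each input link its own programmed stage field), the multicast and in-band sources are omitted, and the `RequestElement` class and `parse_pair` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestElement:
    stage_tags: tuple   # self routing tag: one output link number per stage
    priority: int       # 0 (lowest) to 7 (highest)


def parse_pair(latched: dict, pair_index: int, stage_num: int) -> list:
    """Process the two input links visited on one clock cycle.

    `latched` maps input link number -> RequestElement for every link whose
    'request start' pulse has been latched. Returns (output_link, element)
    tuples, higher priority first, so the first entry plays the role of 'reqa'.
    """
    results = []
    for link in (2 * pair_index, 2 * pair_index + 1):
        element = latched.pop(link, None)    # pop also clears the latch (step 6)
        if element is None:
            continue                          # idle link, nothing to process
        tags = list(element.stage_tags)
        output_link = tags[stage_num - 1]     # tag that is valid for this stage
        tags[stage_num - 1] = link            # record the reverse path for the grant
        results.append((output_link, RequestElement(tuple(tags), element.priority)))
    results.sort(key=lambda item: item[1].priority, reverse=True)   # step 4
    return results
```

A free running counter would call `parse_pair` with `pair_index` 0, 1, 2 and so on, one pair per clock, and the latched dictionary would be emptied at each row boundary.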
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Figure 7-9: Request Parser Block Diagram 

7.2.1.2 Request Arbiter 

The request arbiter will be instantiated once per output link. It will connect to the request 
parser. The purpose of the request arbiter is to provide a small pool of requests (24) from which the 
highest priority oldest stored request will always be made available to the row mapper module. The 
input timing to the request arbiter matches that of the parser. The worst case output timing is given 
below: 



[Timing diagram: signals cclk, olink_req_ (name truncated), olink_re_rd]



Figure 7-10: Request Arbiter Output Timing 

The output logic must be able to supply a new request element every 2 clock cycles. 

When a request becomes 'valid,' it is written to the buffers on the following clock cycle. After it
has been written into the buffers, it is available to be output from this module on the next clock cycle. 

Requests are written to the buffer pool by using the following rules (request A always has 
priority equal to or greater than B): 

1. Request A always will have a priority equal to or greater than request B.

2. Request A will be resolved before Request B w.r.t. free buffers and evictions. 

3. If there are any empty buffers, the requests are written into them before anything is evicted. 
Request A will be allocated to the first free buffer, if another free buffer exists request B will be 
granted it. 

4. If there are not enough free buffers, the lowest priority youngest request which is in the buffer
pool is evicted if its priority is less than that of the incoming request elements. 

5. Situation: Request A is priority 5, request B is priority 3, there is 1 free buffer and the lowest
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priority request in the buffer pool is of priority 4. Request A will be written to the free buffer 
and request B will be dropped. Therefore, the request buffer pool will always represent the highest
priority requests that have come in at that time.
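
The write rules above can be modelled with a small behavioural sketch. It uses a plain list and an arrival counter instead of the per-priority timestamps of the real design, and the class and method names are assumptions made for illustration only.

```python
class RequestBufferPool:
    """Behavioural model of one output link's request buffer pool."""

    def __init__(self, capacity: int = 24):
        self.capacity = capacity
        self.entries = []      # (priority, arrival order) per stored request
        self._arrivals = 0

    def _write(self, priority: int) -> bool:
        self._arrivals += 1
        new = (priority, self._arrivals)
        if len(self.entries) < self.capacity:          # rule 3: free buffer first
            self.entries.append(new)
            return True
        # rule 4: candidate victim is the lowest priority, youngest request
        victim = min(self.entries, key=lambda e: (e[0], -e[1]))
        if victim[0] < priority:
            self.entries.remove(victim)
            self.entries.append(new)
            return True
        return False                                   # incoming request dropped

    def write_pair(self, prio_a: int, prio_b: int) -> tuple:
        """Write requests A and B; A is resolved first (rules 1 and 2)."""
        if prio_a < prio_b:
            raise ValueError("request A must have priority >= request B")
        return self._write(prio_a), self._write(prio_b)

    def read(self):
        """Remove and return the highest priority, oldest request, or None."""
        if not self.entries:
            return None
        best = max(self.entries, key=lambda e: (e[0], -e[1]))
        self.entries.remove(best)
        return best
```

Replaying the situation from rule 5 on a pool with one free buffer and a lowest stored priority of 4, `write_pair(5, 3)` stores request A and drops request B.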



The top level I/O for this module is given below:
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reqb_en 
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Enable flags for the 12 output links for request b 



Table 7-11: Request Arbiter Top Level I/O 



7.2.1.2.1 Request Arbiter Design Notes 

The Buffer Pool will consist of buffering the actual request elements as well as a TimeStamp 
field which is unique per priority class. This allows the buffer pool manager to know which request 
elements can be overwritten (lowest priority, newest requests), and which request elements should be 
forwarded to the next switch element (highest priority, oldest requests). Below is an illustration of the
buffer: 
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Table 7-12: Buffer Pool Storage Element 



As requests come streaming into each output link buffer pool, there will be a controller which 
determines if the request should be dropped, written to an unoccupied buffer or overwrite a particular 
buffer which contains a lower priority request. Each request will be given a timestamp unique to its 
priority class as well as a timestamp/priority pair which is used to search the buffer pool. This pair 
will be compared against the contents of each of the buffers in the buffer pool. The buffer searches are
completely decoupled from one another so that this will not be a critical path which needs to ripple 
across all 24 buffers. In the event that a buffer is empty another set of signals will uniquely identify 
that buffer, and the request element will be written into it. For each request element here are the 
pieces of information which will be given to each request element: 

• Overwrite TimeStamp - This is the timestamp which will be used to search the buffer pool 

• Current TimeStamp - This is the timestamp which the request element has been assigned 

• RequestWriteEnable - Active high, indicates to search the buffer pool using the Overwrite
TimeStamp

• Empty Buffer Enables[23:0] - A vector with only 1 of its bits active which assigns the request 
element to a particular buffer which is known to be empty. 

This information will be used by the buffer pool 'write' logic which is shown below.
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Figure 7-13: Buffer Pool Write* Logic 



The buffer pool write logic will be controlled by a buffer pool manager. This buffer pool
manager will need to keep track of the timestamps as well as which buffers are empty. The following
counters will be kept:

• NextWrite TimeStamp (8 counters- 1 for each priority)

• NextRead TimeStamp (8 counters- 1 for each priority)

The updating of these counters and the assigning of timestamps will be governed by the 
following rules: 

1. Whenever a new request comes in, if a buffer is empty it is assigned to this request element. The
NextWrite TimeStamp counter is assigned to this request and the counter is incremented.

2. If all the buffers are full and a lower priority request exists in the buffer pool (next
read and next write timestamps for each priority are not-equal), the newest lowest priority
request (NextWriteTimeStamp) will be used as the TimeStampOverWrite, and that counter will
be decremented. The NextWriteTimeStamp for the request's priority will be assigned to the
request element and incremented.

3. If there don't exist any free buffers and the lowest priority is greater than or equal to the cur- 
rent request element it will not be assigned to anything, its request enable signal will be inac- 
tive. 



Timestamps will be assigned to each request element which comes in. If it is a 2 group request 
element, only a single timestamp will be assigned. 

Continuously, the following is re-evaluated: 

• All of the NextWriteTimeStamps and the NextReadTimeStamps will be compared. If any are not 
equal, the highest priority one will be chosen. The NextRead TimeStamp will be used along with 
the priority to search the buffer pool. All of the buffers are compared against these values, and the 
one which matches will be output. 

• When the slot mapper logic needs a request element this one will be output (registered) and the 
buffer cleared. When this happens the NextRead counter will be incremented. 



Each of these counters will have the ability to be cleared (at the start of a row) and be 
incremented or decremented by 1 or 2 every clock cycle. Since there are only 96 request (or grant) 
elements the NextRead counter will never get larger than 96. Likewise, when a new request comes in 
the NextWrite counter gets incremented, but when a buffer pool request is overwritten, the counter is 
decremented. As a result, the worst case is all traffic of the same priority. The counter will go up to 96 
and stay there since newer requests do not have preference over older requests at the same priority 
level. This is consistent with the oldest highest priority requests being forwarded to the next switch 
element in the fabric. The 'Next Read' timestamp is incremented each time a new buffer is played out 
of the buffer pool. This ensures that the 'first-come-first-serve' principle remains in effect. 

This module should be implemented using 2 counters for each priority (16 total counters @ 7 
bits wide) for a total of 112 FFs. 
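
The write and read rules above can be exercised with a small behavioral model. The following is only a sketch under stated assumptions, not the hardware implementation: the class name `BufferPool`, its methods, and the convention that priority 0 is the highest priority are all hypothetical (the text does not say which numeric value is highest), and the pool is modelled sequentially rather than with parallel comparators.

```python
NUM_PRIORITIES = 8


class BufferPool:
    """Toy model of the buffer pool manager. ASSUMPTION: priority 0 is highest."""

    def __init__(self, num_buffers=24):
        # Each occupied buffer holds (priority, timestamp, payload); None means empty.
        self.buffers = [None] * num_buffers
        self.next_write = [0] * NUM_PRIORITIES  # NextWrite TimeStamp per priority
        self.next_read = [0] * NUM_PRIORITIES   # NextRead TimeStamp per priority

    def _find(self, priority, timestamp):
        # Models the search of every buffer against a (priority, timestamp) pair.
        for i, entry in enumerate(self.buffers):
            if entry is not None and entry[:2] == (priority, timestamp):
                return i
        raise LookupError("no buffer matches the requested timestamp")

    def write(self, priority, payload):
        """Apply rules 1-3; return True if stored, False if dropped."""
        if None in self.buffers:                       # rule 1: a buffer is empty
            slot = self.buffers.index(None)
        else:
            occupied = [p for p in range(NUM_PRIORITIES)
                        if self.next_read[p] != self.next_write[p]]
            victim = max(occupied)                     # lowest priority present
            if victim <= priority:                     # rule 3: nothing lower to evict
                return False
            self.next_write[victim] -= 1               # rule 2: overwrite the newest
            slot = self._find(victim, self.next_write[victim])
        self.buffers[slot] = (priority, self.next_write[priority], payload)
        self.next_write[priority] += 1
        return True

    def read(self):
        """Play out the oldest element of the highest priority, or None if empty."""
        for p in range(NUM_PRIORITIES):
            if self.next_read[p] != self.next_write[p]:
                slot = self._find(p, self.next_read[p])
                payload = self.buffers[slot][2]
                self.buffers[slot] = None
                self.next_read[p] += 1
                return payload
        return None
```

Because overwrites always remove the newest timestamp of a priority and reads always remove the oldest, the timestamps outstanding for each priority stay contiguous, which is why a pair of counters per priority is enough to locate any element.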

Below is a block diagram showing the partitioning for the buffer pool logic. 
Figure 7-14: Buffer Pool Partitioning Block Diagram 

7.2.2 Statistics Gathering 

RESERVED 



7.2.3 Grant Implementation 

The grants flow through the switch in an out of band network. The grant processing is the 
same as the request processing, so the 'parser' and 'buffer pool' logic will be re-instantiated on the 
grant path. The only difference will be that for the grant message, some logic needs to assemble the 
grant elements, just as in the datapath there is a request assembler. This logic will take slot data 
from the synchronizers and assemble grant elements. On the output end, there will be a mapping 
function which takes grant elements and maps them into the correct slots. Then, these slots will be 
converted into a serial stream. 



[Figure: grant path block diagram. Grant Slots (12) enter the Grant DeMapper, which passes 
gnt_element and gnt_start to the Grant Parser. The parser feeds one Grant Buffer Pool per link, 
and the buffer pools feed the Grant Mapper, which drives the outgoing Grant Slots (12). The dotted 
line represents the portion re-instantiated from the Request side.] 



The Grant Serial link will be run at 2.2 Gbps, the same speed as the data path's serial links. 
The data path has 2 links which combine to source 1700 slots at a rate of 125M slots per second. 
Since the grant path has half the bandwidth it sources slots at a 75M slots per second rate. 

Grant Elements arrive at a rate of 2 per every 3 slots. In every slot time on the grant link there 
are 4 clocks. Therefore, for every Grant, 1.5 slots will have occurred, and there will be 6 clock 
cycles. Since there are 12 links, 2 grant elements need to be processed every clock cycle to keep up 
with the input data rate. A minimal amount of buffering will be needed to hold the assembled grants 
to allow them to be played out at the correct rate. 
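
The processing-rate argument above is plain arithmetic and can be checked mechanically. This sketch only restates the numbers given in the paragraph; the constant and variable names are chosen here for illustration.

```python
from fractions import Fraction

GRANTS_PER_GROUP = 2   # grant elements carried per group of slots
SLOTS_PER_GROUP = 3    # slots in one group
CLOCKS_PER_SLOT = 4    # clocks in one grant-link slot time
NUM_LINKS = 12         # input grant links

# Slots that elapse on one link for each grant element: 3/2 = 1.5
slots_per_grant = Fraction(SLOTS_PER_GROUP, GRANTS_PER_GROUP)

# Clock cycles that elapse on one link for each grant element: 1.5 * 4 = 6
clocks_per_grant = slots_per_grant * CLOCKS_PER_SLOT

# Grant elements arriving per clock across all links: 12 / 6 = 2
grants_per_clock = NUM_LINKS / clocks_per_grant
```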



[Figure: timing diagram; 1 grant row = 850 slots] 
Figure 7-15: Grant Link Mapping/DeMapping Timing 

The unused bandwidth between grant elements can be used to carry other traffic. Nothing is 
defined at the moment which uses this bandwidth (or can even have access to it), but it is nonetheless 
there. 

A group of 3 slots comprises 2 request or grant elements. The following diagram describes the 
mapping of the slots to arbitration elements, as well as the parity generation. 



Figure 7-16: Grant/Request Element Slot Format 
7.2.3.1 Grant Link Overhead 

There will normally be 10 slots used for Link Overhead (LOH). The mapper module will be 
responsible for inserting the contents of most of these LOH slots into the link data stream. 

[Figure: Grant Link Overhead = 10 slots, in order: LOH Identifier, LOH Framing Pattern, CRC, CRC, 
LOH Status, DP Sync, and four LOH Stuff slots.] 

The following types of data may be inserted into the LOH slots: 

LOH Framing Pattern - 

This will be a 36-bit value which is common to all output links. It will be configurable via a 
software programmable register. This pattern will be used in only 1 of the 10 LOH slots. 

LOH Status - 

This 32 bit status field will be configurable via a software programmable register. There will be 
a unique status register for each output link. 

Note: the 4 tag bits are fixed to all 1's. 

LOH Identifier - 

This 32-bit field will contain an identifier for this switch and link. The field is made up as: 

• loh_id[3:0] = link number that the output mapper is instantiated as. 

• loh_id[27:4] = iTSE ID number which is SW configurable. 

• loh_id[31:28] = stage number the iTSE is programmed as. 
Note: the 4 tag bits are fixed to all 1's. 
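
As an illustration of the identifier layout, the three sub-fields can be packed into and recovered from a 32-bit word as follows. The function names are hypothetical and the 4 tag bits that accompany the slot are not modelled.

```python
def pack_loh_id(link, itse_id, stage):
    """Build the 32-bit LOH identifier from its three sub-fields."""
    if not 0 <= link < (1 << 4):
        raise ValueError("link number must fit in loh_id[3:0]")
    if not 0 <= itse_id < (1 << 24):
        raise ValueError("iTSE ID must fit in loh_id[27:4]")
    if not 0 <= stage < (1 << 4):
        raise ValueError("stage number must fit in loh_id[31:28]")
    return (stage << 28) | (itse_id << 4) | link


def unpack_loh_id(loh_id):
    """Return (link, itse_id, stage) extracted from a 32-bit LOH identifier."""
    return loh_id & 0xF, (loh_id >> 4) & 0xFFFFFF, (loh_id >> 28) & 0xF
```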

LOH Stuff - 

This 32-bit pattern will be inserted in the LOH slots which aren't used for framing, status, or 
ID. This pattern will be configurable via a software programmable register and is common to all 
output links. 
Note: the 4 tag bits are fixed to all 1's. 

LOH Sync - 

This is obtained from the deserializers on the datapath. The values are mapped into a 36-bit slot 
as follows. The actual decoding of these bits is reserved for the Synchronizers Specification. Please 
consult it for the use and interpretation of these bits. 

loh_sync[31:26] = 0 

loh_sync[25] = sync_data1_synced 

loh_sync[24] = sync_data0_synced 

loh_sync[23:17] = 0 

loh_sync[16] = sync_offset_count_valid 

loh_sync[15:0] = sync_offset_count 

Note: The 4 tag bits are fixed to all 1's. 
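
The sync word layout can be sketched the same way. The function name and argument names below are illustrative only; the meaning of each bit remains defined by the Synchronizers Specification.

```python
def pack_loh_sync(offset_count, offset_count_valid, data0_synced, data1_synced):
    """Build the 32-bit loh_sync word; all unlisted bits are left at 0."""
    if not 0 <= offset_count < (1 << 16):
        raise ValueError("sync_offset_count must fit in loh_sync[15:0]")
    word = offset_count                      # loh_sync[15:0]
    word |= int(bool(offset_count_valid)) << 16  # loh_sync[16]
    word |= int(bool(data0_synced)) << 24        # loh_sync[24]
    word |= int(bool(data1_synced)) << 25        # loh_sync[25]
    return word
```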

CRC- 

This field is inserted by the serializer unit. Its position is fixed, and it must occupy the 2 slots directly 
following the framing pattern. The grant mapper ram should be programmed as IDLE for these slots. 



7.2.3.2 Grant DeMapper 

The grant DeMapper will accept slots and assemble grant elements for the grant parser. The 
following is the top level I/O for this module. 

RESERVED 

Table 7-17: Grant DeMapper Top Level I/O 

The grant demapper will utilize a Ram which is addressed by the current slot number. The 
output of the Ram will indicate if the current slot is a valid grant element. The grant demapper ram is 
32 bits wide and has the following bits: 
Figure 7-18: Grant DeMapper Ram 
Figure 7-19: Grant DeMapper Timing 

The grant demapper ram allows a switch chip to be located in more than 1 stage of a switch 
fabric by programming the DeMapper Ram accordingly. The ram slot counter will be the data path slot 
counter with the LSB dropped to account for the 50% decrease in the number of slots per row. The 
ram will be programmed so that different input links can have their requests valid at completely 
different times, as long as the time between 2 incoming slots allows both to be processed by the grant 
logic (16 clock cycles). The demapper ram has 3 bits dedicated per input grant link so that future 
expansion is possible. 

7.2.3.3 Mapper Ram 

Here is the top level I/O for the Grant Mapper Module: 
RESERVED 

Table 7-20: Grant Mapper I/O 
Table 7-21: 
Figure 7-22: Grant Mapper Ram 

When the Mapper RAM indicates it is time to transmit a grant element, the highest priority 
52-bit request element is fetched from the arbitration module. Since only 32 bits of the GE may be 
transmitted in a single link slot, it will be necessary to add buffering within the Output Grant Mapper 
module which will buffer the extra bits which cannot be sent in the current slot. 

A 3-slot structure will be defined which will be used to transmit two 52-bit GEs. The timing for 
fetching GEs is shown below. In this example, the 3-slot GE structure is programmed to be 
transmitted on output link slots N, N+1, and N+2. 
Figure 7-23: Grant Element Timing 

There will be a programmable field which indicates the total number of requests or grants 
which can be made every row time. This value will be per output link and take into account the 
number of groups locked out for TDM traffic. After this value is reached, the output link will output 
idles. This is independent of the configuration ram mapping. By setting this value to something small, 
software has an easy way to do performance testing and to exercise network congestion. It also 
allows a system to over-request or over-grant a link, since arbiters further on down the line will 
'knock out' a certain percentage of requests. 

7.2.4 Request Arbitration Memory Map 

[Register layout, continued: for each of Links 06 through 11 the map holds a 'Max Request Link NN' 
field and a 'Num Requests Link NN' field. All other bit positions read as 0.] 

7.2.4.1 Link X Request Element Priority Statistics Registers 

Bits 31:27   Bits 26:16                             Bits 15:11   Bits 10:0                              Address Offset 
0            Link 00 Priority 0 Requests Dropped    0            Link 00 Priority 0 Request Received    0x9000 
0            Link 00 Priority 1 Requests Dropped    0            Link 00 Priority 1 Request Received    0x9004 
0            Link 00 Priority 2 Requests Dropped    0            Link 00 Priority 2 Request Received    0x9008 
0            Link 00 Priority 3 Requests Dropped    0            Link 00 Priority 3 Request Received    0x900C 
0            Link 00 Priority 4 Requests Dropped    0            Link 00 Priority 4 Request Received    0x9010 
0            Link 00 Priority 5 Requests Dropped    0            Link 00 Priority 5 Request Received    0x9014 
0            Link 00 Priority 6 Requests Dropped    0            Link 00 Priority 6 Request Received    0x9018 
0            Link 00 Priority 7 Requests Dropped    0            Link 00 Priority 7 Request Received    0x901C 


This pattern repeats for each of the 12 links, the offset is 0x20 between different link's 
statistics registers. These registers are read - only. 
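
The addressing pattern for these statistics registers can be written out as a short sketch. The base offset, the 0x20 link stride, and the 4-byte step per priority come from the table and text above. The field positions used in `decode_stat` (dropped count in bits 26:16, received count in bits 10:0) are an assumption read off the bit ruler of the register figure, and both function names are hypothetical.

```python
STATS_BASE = 0x9000     # Link 00 Priority 0 register
LINK_STRIDE = 0x20      # offset between consecutive links
PRIORITY_STRIDE = 0x4   # one 32-bit register per priority


def stat_address(link, priority):
    """Address offset of the statistics register for a link/priority pair."""
    if not 0 <= link < 12 or not 0 <= priority < 8:
        raise ValueError("link must be 0-11 and priority 0-7")
    return STATS_BASE + link * LINK_STRIDE + priority * PRIORITY_STRIDE


def decode_stat(word):
    """Split a register value into (dropped, received); layout is ASSUMED."""
    dropped = (word >> 16) & 0x7FF   # assumed bits 26:16
    received = word & 0x7FF          # assumed bits 10:0
    return dropped, received
```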

7.2.4.2 Link X Request Element Counters 

[Register layout: a 'Link 00 Maximum Requests / Row' field and a 'Link 00 Num Requests Forwarded' 
field, each preceded by zero-filled bits. All remaining bit positions read as 0. Address offset 0x9160.] 



The Maximum Requests per row field has a reset value of 96 and is read-writeable. The 
number of requests forwarded refers to the previous row and is read-only. 



7.2.5 Grant Arbitration Memory Map 

The grant arbitration has a number of configurable registers which are used for configuration 
of the switch element, statistics gathering and performance monitoring, and testability. These 
registers are outlined below: 



Base Address 0x00440000 

Address Offset      Register (bits 31:0) 
0x0000 - 0x0004     Grant Mapper Ram Slot 0 
0x0008 - 0x000C     Grant Mapper Ram Slot 1 
...                 ... 
0x1A88 - 0x1A8C     Grant Mapper Ram Slot 849 
0x1A90              unmapped address space 
0x4000 - 0x4004     Grant DeMapper Ram Slot 0 
...                 ... 
0x5A88 - 0x5A8C     Grant DeMapper Ram Slot 849 
0x5A90              unmapped address space 
0x8000 - 0x802C     Grant Link 00-11 Status Reg 
0x8030              Grant Link Framing Pattern 
0x8034              Grant Link Framing Pattern (upper bits); remaining bits read only, all zeroes 
0x8038              Grant Link Common 'Stuff' Reg 
0x803C              Grant Link 00 Line Overhead Status Reg 
...                 ... 
0x8068              Grant Link 11 Line Overhead Status Reg 
0x806C - 0x8074     Grant Max Grants 
0x8078              Grant Capture Interrupt Status Register 
0x807C - 0x80A8     Grant Link 0-11 Capture Register Contents 
0x80AC - 0x80D8     Grant Link 0-11 Capture Register Masks 
0x80DC              Gnt Config | 000000000000 | Grant Parity Error Masks 
0x80E0              Grant Error Interrupt Mask 
0x80E4              Grant Error Interrupt Status Register 
0x80E8              Grant Parity Error 
0x80EC              Grant Mapper Sequence Error | Grant DeMapper Sequence Error 
0x80F0              Grant Dest Error Element[31:00] 
0x80F4              Grant Dest Error Element[47:32] | reserved 

Table 7-24: Grant Memory Map (Shamelessly copied from Ch. 15) 

7.2.5.1 Grant Mapper RAM 



Address Offset 0x0000: 
  Bits 31:28   Grant Link 00 
  Bits 27:24   Grant Link 01 
  Bits 23:20   Grant Link 02 
  Bits 19:16   Grant Link 03 
  Bits 15:12   Grant Link 04 
  Bits 11:8    Grant Link 05 
  Bits 7:0     0 

Address Offset 0x0004: 
  Bits 31:28   Grant Link 06 
  Bits 27:24   Grant Link 07 
  Bits 23:20   Grant Link 08 
  Bits 19:16   Grant Link 09 
  Bits 15:12   Grant Link 10 
  Bits 11:8    Grant Link 11 
  Bits 7:0     0 

Reset Value: Unknown 



The Grant Mapper Ram specifies the outgoing slot numbers that grant elements will be put 
upon. It is critical that this ram be written before the switch fabric is made operational. The reset 
value of this register is unknown. Any location may be written to or read at any time by the master 
processor or the host interface. However, it is recommended that the processor update the values of 
this ram when traffic is either not going through the link, or during a slot location far away from the 
current slot number. Bits 7-0 will always read back a zero. 



Mapper Ram Bit Coding 

Value (0b)   Meaning 
0000         Idle 
0001         GE.0 
0010         GE.2 
0011         GE.3 
1000         LOH Framing Pattern 
1001         LOH Status 
1010         LOH ID 
1011         LOH Stuff 
1100         LOH Sync 
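
To make the mapper ram layout concrete, the sketch below packs twelve 4-bit link codes into the two 32-bit words of one slot entry and computes the entry's address offsets. It assumes Link 00 occupies bits 31:28 of the first word and Link 06 occupies bits 31:28 of the second, with bits 7:0 left at zero, as the register figure suggests. The function names and the `CODES` dictionary are illustrative only; the code values are copied from the coding table as printed.

```python
CODES = {
    "IDLE": 0b0000,
    "LOH_FRAMING": 0b1000,
    "LOH_STATUS": 0b1001,
    "LOH_ID": 0b1010,
    "LOH_STUFF": 0b1011,
    "LOH_SYNC": 0b1100,
}


def mapper_slot_offsets(slot):
    """Address offsets of the two words describing one grant mapper ram slot."""
    if not 0 <= slot < 850:
        raise ValueError("a grant row has 850 slots (0-849)")
    return slot * 8, slot * 8 + 4


def pack_mapper_entry(link_codes):
    """Pack 12 four-bit codes (index = link number) into (word0, word1)."""
    if len(link_codes) != 12 or any(not 0 <= c < 16 for c in link_codes):
        raise ValueError("expected twelve 4-bit codes")
    words = []
    for half in (link_codes[:6], link_codes[6:]):
        word = 0
        for position, code in enumerate(half):
            word |= code << (28 - 4 * position)   # first link of each half in bits 31:28
        words.append(word)
    return tuple(words)
```

For example, programming slot 849 as a framing slot on link 0 only would write the first returned word to offset 0x1A88 and the second to 0x1A8C.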



7.2.5.2 Grant DeMapper RAM 



Address Offset 0x4000: 
  Bits 31:28   0, Grant Link 00 
  Bits 27:24   0, Grant Link 01 
  Bits 23:20   0, Grant Link 02 
  Bits 19:16   0, Grant Link 03 
  Bits 15:12   0, Grant Link 04 
  Bits 11:8    0, Grant Link 05 
  Bits 7:0     0 

Address Offset 0x4004: 
  Bits 31:28   0, Grant Link 06 
  Bits 27:24   0, Grant Link 07 
  Bits 23:20   0, Grant Link 08 
  Bits 19:16   0, Grant Link 09 
  Bits 15:12   0, Grant Link 10 
  Bits 11:8    0, Grant Link 11 
  Bits 7:0     0 

Reset Value: Unknown 


The Grant DeMapper Ram specifies the incoming slot numbers that grant elements will be 
coming in upon. It is critical that this ram be written before the switch fabric is made operational. The 
reset value of this register is unknown. Any location may be written to or read at any time by the 
master processor or the host interface. However, it is recommended that the processor update the 
values of this ram when traffic is either not going through the link, or during a slot location far away 
from the current slot number. Bits 7-0 will always read back a zero. 



Mapper Ram Bit Coding 

Value (0b)   Meaning 
000          Idle 
001          GE.0 
010          GE.2 
011          GE.3 
1000         Capture Register 

7.2.5.3 Grant Link Status Register 



  Bit 31       FifoFill 
  Bit 30       0 
  Bits 29:24   Current Fifo Watermark 
  Bit 23       0 
  Bits 22:16   Grants Received Last Row 
  Bit 15       0 
  Bits 14:8    Grants Dropped Last Row 
  Bit 7        0 
  Bits 6:0     Grants Forwarded Last Row 

Address Offset: 0x8000 - 0x802C 
Reset Value: all bits 0 



There is a grant link status register for each of the 12 outgoing links. Address offset 0x8000 is for 
link 0, while 0x802C is for link 11. The link status register is read only. Writes do not have an effect on 
these registers. The sub fields are defined as: 

FifoFill: This bit shall be set if the grant fifo is currently filled. 

Current Fifo Watermark: This shall read back the current number of grants that are 
buffered. 

Grants Received Last Row: Total number of grants received during the last row. This is 
updated every row upon an EndOfRow. This is independent of the number of PDUs which the grant 
was supposed to reserve. 

Grants Dropped Last Row: Total number of grants dropped during the last row. This is 
updated every row upon an EndOfRow. This is independent of the number of PDUs which the grant 
was supposed to reserve. 

Grants Forwarded Last Row: Total number of grants which were forwarded during the last 
row. For GE's which reserve more than 1 PDU this is the total number of PDUs reserved. 
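
A status word could be decoded along the following lines. This is a sketch that assumes the field positions suggested by the bit ruler of the register figure (FifoFill in bit 31, watermark in bits 29:24, received in bits 22:16, dropped in bits 14:8, forwarded in bits 6:0). Those positions and the function names are assumptions, not confirmed values.

```python
STATUS_BASE = 0x8000  # link 0; link 11 is at 0x802C


def status_address(link):
    """Address offset of the grant link status register for one link."""
    if not 0 <= link < 12:
        raise ValueError("link must be 0-11")
    return STATUS_BASE + 4 * link


def decode_link_status(word):
    """Split a status word into its sub-fields (bit positions are ASSUMED)."""
    return {
        "fifo_fill": bool((word >> 31) & 0x1),
        "fifo_watermark": (word >> 24) & 0x3F,
        "grants_received": (word >> 16) & 0x7F,
        "grants_dropped": (word >> 8) & 0x7F,
        "grants_forwarded": word & 0x7F,
    }
```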

7.2.5.4 Grant Link Framing Pattern 



Address Offset 0x8030: 
  Bits 31:0    Frame0 

Address Offset 0x8034: 
  Bits 31:28   0 
  Bits 27:24   Frame1 
  Bits 23:0    0 



The grant link framing pattern is 36 bits wide and is made by concatenating Frame0 with 
Frame1 such that: FramingPattern[35:0] = {Frame1[27:24], Frame0[31:0]}. The reset value of Frame 0 is 
0x0F628_0000 and Frame 1 is 0, so that the framing pattern is 0x0F628_0000. 
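
The concatenation can be expressed directly. In this sketch the function name is hypothetical, and the example value 0xF6280000 for the Frame0 register is an interpretation of the reset value quoted above as a 32-bit quantity.

```python
def framing_pattern(frame0, frame1):
    """Return the 36-bit pattern {Frame1[27:24], Frame0[31:0]}."""
    upper = (frame1 >> 24) & 0xF      # Frame1[27:24] becomes bits 35:32
    lower = frame0 & 0xFFFFFFFF       # Frame0[31:0] becomes bits 31:0
    return (upper << 32) | lower
```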

7.2.5.5 Grant Link Common 'Stuff' Reg 

  Bits 31:0    Stuff Reg 

Address Offset: 0x8038 
Reset Value: all bits 0 

This register's contents will be inserted into the link overhead 'stuff' slot designated by the 
mapper ram. This register is read/writable at any time. Writes are effective immediately. 

7.2.5.6 Grant Link Line Overhead Status Register 



  Bits 31:0    Link 0-11 Overhead Status Register 

Address Offset: 0x803C - 0x8068 
Reset Value: all bits 0 


This register's contents will be inserted into the link overhead's status slot designated by the 
mapper ram. This register is read/writable at any time. Writes are effective immediately. 

7.2.5.7 Maximum Grants Register 



Address Offset   Bits 31:24               Bits 23:16               Bits 15:8                Bits 7:0 
0x806C           Link 00 Maximum Grants   Link 01 Maximum Grants   Link 02 Maximum Grants   Link 03 Maximum Grants 
0x8070           Link 04 Maximum Grants   Link 05 Maximum Grants   Link 06 Maximum Grants   Link 07 Maximum Grants 
0x8074           Link 08 Maximum Grants   Link 09 Maximum Grants   Link 10 Maximum Grants   Link 11 Maximum Grants 

Reset Value: 0110 0000 in each 8-bit field 


The Maximum Grants Register may be written or read at any time. Its reset value is 96 for 
each link (corresponding to the maximum number of grants which any of the links can accommodate). 
It is recommended that the number of groups available for PDU traffic at any given time be written 
into these positions to ensure that no mapper/demapper programming error or service processor 
error allows the introduction of more grants than PDUs that can be carried on the data link. 
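
A read-modify-write of one link's limit might look like the sketch below. It assumes four 8-bit fields per word with the lowest-numbered link in bits 31:24, and a first word at offset 0x806C as listed in the grant memory map. The function names are hypothetical.

```python
MAX_GRANTS_BASE = 0x806C      # Links 00-03; 0x8070 holds 04-07, 0x8074 holds 08-11
RESET_WORD = 0x60606060       # 96 in each 8-bit field


def max_grants_location(link):
    """Return (address offset, bit shift) of a link's Maximum Grants field."""
    if not 0 <= link < 12:
        raise ValueError("link must be 0-11")
    return MAX_GRANTS_BASE + 4 * (link // 4), 24 - 8 * (link % 4)


def get_max_grants(word, link):
    """Extract one link's limit from the word that contains it."""
    _, shift = max_grants_location(link)
    return (word >> shift) & 0xFF


def set_max_grants(word, link, value):
    """Return the word with one link's limit replaced, other fields untouched."""
    if not 0 <= value <= 0xFF:
        raise ValueError("limit must fit in 8 bits")
    _, shift = max_grants_location(link)
    return (word & ~(0xFF << shift) & 0xFFFFFFFF) | (value << shift)
```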

7.2.5.8 Grant Capture Interrupt Status Register 



  Bits 31:12   0 
  Bits 11:0    Grant Capture Interrupt Status Register 

Address Offset: 0x8078 
Reset Value: all bits 0 


If any of the grant capture registers change from one row time to the next, the corresponding 
link bit in this register will get set. Bit 0 is for link 0, bit 11 for link 11. The other bits are unused and 
should always read back zero. Whenever any of these bits are set the capture interrupt will become 
active. To clear the interrupt, the processor must clear the offending bit in this register by writing a 
1 to it. Writes of 'zero' have no effect. 



7.2.5.8.1 Grant Link Capture Registers 



  Bits 31:0    Grant Link 0 - 11 Capture Registers 

Address Offset: 0x807C - 0x80A8 
Reset Value: all bits 0 


These are 12 read only registers which have reset values of zero. Each time the grant 
demapper encounters a 'Link Overhead Capture' entry, the slot data is stored in the corresponding 
register. These registers may be written to at any time, but writes have no effect. 

7.2.5.9 Grant Link Capture Mask Registers 



  Bits 31:0    Grant Link 0 - 11 Capture Mask Registers 

Address Offset: 0x80AC - 0x80D8 
Reset Value: all bits 0 



These 12 registers provide bit masks for the capture interrupt register. If any bit is set to a 1, 
then the hardware will monitor that bit in the link's capture register from row to row. If that bit 
changes, the grant capture interrupt status register will latch the link whose bit has changed and output 
an interrupt to the processor. Every bit in each of the 12 input links can be set or cleared individually 
to trigger the capture interrupt. This register's reset value is all zeros, which inhibits the interrupt from 
occurring. The register may be written or read at any time. 
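
The interaction between the capture registers, their masks, and the write-1-to-clear status bits can be modelled in a few lines. This is a behavioral sketch only; the function names are hypothetical and the per-row sampling is reduced to comparing two snapshots.

```python
NUM_LINKS = 12


def update_capture_status(status, previous, current, masks):
    """Set the status bit of every link whose masked capture bits changed."""
    for link in range(NUM_LINKS):
        changed = previous[link] ^ current[link]
        if changed & masks[link] & 0xFFFFFFFF:
            status |= 1 << link
    return status


def write_capture_status(status, written):
    """Model a processor write: 1s clear the matching bits, 0s have no effect."""
    return status & ~written & 0xFFF


def capture_interrupt_active(status):
    """The capture interrupt is active while any status bit is set."""
    return status != 0
```

With all masks at their reset value of zero, no change in a capture register can raise the interrupt, which matches the behavior described above.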

7.2.5.10 Grant Configuration / Grant Parity Error Mask Register 

[Register layout: Grant Configuration Register field in the upper bits and Grant Parity Error Mask 
field in the lower bits, separated by zero-filled bits.] 

Address Offset: 0x80DC 
Reset Value: all bits 0 


Grant Parity Error Masks - bit is active high to enable parity conformance of incoming grants. 

Bit 0: Link 00 
Bit 1: Link 01 
Bit 2: Link 02 
... 
Bit 11: Link 11 

Grant Config 

Bit 7: Grant Rotate Enable: Reset value is 0. Set to a 1 to enable grant parser rotation. 

Bit 6: Disable Num Field (grants are forced to single PDU reservation mode). Default enables 
grants to reserve more than 1 PDU. 



7.2.5.11 Grant Error Interrupt Mask Register 



  Bits 31:0    Grant Error Interrupt Mask 

Address Offset: 0x80E0 
Reset Value: all bits 0 



Bit 31: Grant Start Signals Unaligned Error Mask: Program to 1 to enable an interrupt to 
occur if link demappers are out of sync with respect to the start signals which they create. 

Bit 30: Grant Minimum Start Pulse Error Mask: Reset is 0. Program to 1 to enable this type of 
error. This error mask is ineffective for grants since there is no minimum pulse period (it is a 
holdover from the request parser). 

Bit 29: Grant Remaining Mask: At the end of a row, if grants are remaining in the buffers, this 
will trigger an interrupt when set to a 1. Reset value is 0. 

Bit 28: Grant Fifo Filled Mask: If the grant fifo overflows for a link, this will trigger an interrupt 
when programmed to a 1. Reset value is 0. Fifo watermarks need to be investigated for the link. 

Bits 23-12: Grant Mapper Sequencing Mask: Set to a 1 to allow sequencing errors to generate 
an interrupt. Bit 23 refers to Link #11, bit 12 refers to Link #0. 

Bits 11-0: Grant Demapper Sequencing Mask: Set to a 1 to allow sequencing errors to generate 
an interrupt. Bit 11 refers to Link #11, bit 0 to Link #0. 

7.2.5.12 Grant Error Interrupt Status Register 

  Bits 31:24   Grant Error Interrupt Status 
  Bits 23:0    reserved 

Address Offset: 0x80E4 



Bit 31: Grant Start Align Error - signals an alignment error. Write a 1 to clear this interrupt 
source. 

Bit 30: Grant Start Min Error - signals that grant elements have arrived too quickly. Write a 1 
to clear this interrupt source. 

Bit 29: Grants Remaining - signals grants are still in the fifo at the end of the row. Write a 1 
to clear this interrupt source. 

Bit 28: Grant Fifo Filled - signals that a grant fifo has overfilled. Write a 1 to clear this 
interrupt source. 

Bit 27: Grant Destination Error - signals that a grant element has attempted to go to an 
output it isn't supposed to. Write a 1 to clear this type of interrupt source. 

Bit 26: Grant Parity Error - signals to read the grant parity error register 

Bit 25: Grant Mapper Sequencing Error - signals to read grant sequencing error register 

Bit 24: Grant Demapper Sequencing Error - signals to read grant sequencing error register 
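
An interrupt handler would typically turn the status word into a list of named causes before deciding which secondary register to read. The sketch below uses the bit assignments listed above; the symbolic names and the function are illustrative, not part of the specification.

```python
ERROR_STATUS_BITS = {
    31: "start_align_error",
    30: "start_min_error",
    29: "grants_remaining",
    28: "fifo_filled",
    27: "destination_error",
    26: "parity_error",              # read the grant parity error register
    25: "mapper_sequencing_error",   # read the grant sequencing error register
    24: "demapper_sequencing_error", # read the grant sequencing error register
}


def decode_error_status(word):
    """Return the names of all error sources flagged in the status word."""
    return [name for bit, name in sorted(ERROR_STATUS_BITS.items(), reverse=True)
            if (word >> bit) & 1]
```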

7.2.5.13 Grant Parity Error Register 



  Bits 31:12   reserved 
  Bits 11:0    Grant Link Parity Status 

Address Offset: 0x80E8 

Bit 11: Link 11 
Bit 10: Link 10 
... 
Bit 0: Link 0 

When a grant parity error is detected, the register should be read to determine which link had 
the error. Writing a 1 to the bit in this register which is causing the interrupt, will clear the interrupt 
as well as the bit. 

7.2.5.14 Grant Sequencing Error Register 

[Register layout, from the most significant bit down: zero-filled bits, Mapper Sequencing Status, 
zero-filled bits, DeMapper Sequencing Status.] 

Address Offset: 0x80EC 



When a Sequencing Error is detected, this register should be read to determine the input or 
output link which has generated the error. Write a 1 to the offending bit to clear the interrupt. 

Bit 24: Output Link 11 
... 
Bit 16: Output Link 0 
Bit 11: Input Link 11 
... 
Bit 0: Input Link 0 

7.2.5.15 Grant Destination Error Register 



Address Offset 0x80F0: 
  Stored Grant[47:16] 

Address Offset 0x80F4: 
  Stored Grant[15:0] | reserved 



This read only register allows the system to read back a grant which was caused by a wiring 
configuration error. Any grant which does not adhere to its input link 'imask' configuration will be 
latched in this register and cause an interrupt (if enabled to do so). If multiple links have errors on 
them, only the 'last link' (i.e. the higher numbered one) will be captured. 



7.2.6 Software Notes 

RESERVED 



Proprietary and Confidential Information of Onex Communications Corporation 



8 Control Protocol 

RESERVED. 

8.1 Host Interface 

RESERVED. 

8.2 In-Band 

RESERVED. 



8.2.1 Message Format 

RESERVED. 

8.2.2 Method of Operation - Transmitting ASIC 

RESERVED. 

8.2.3 Method of Operation - Receiving ASIC 

RESERVED. 
9 Clock Definitions 

The Switch ASIC has four main functional clock domains: core_clk, global_byte_clk, 
link_byte_clk, and link_serial_clk (the Switch also supports JTAG and the associated TCLK domain). 
These clocks are generated from two primary input reference clocks: link_ref_clk and sw_ref_clk. A 
top level view of the clock domains is shown in Figure 9-1. The link_serial_clk clocks are generated 
within the receive core and transmit core. These are shown in Figure 9-2 and Figure 9-3. 




[Figure: block diagram showing the input reference clocks link_ref_clk and sw_ref_clk, the derived 
clocks link_ref_if_clk, core_clk, global_byte_clk and link_byte_clk, the I/O logic and core logic 
blocks, 9 receive cores and 9 transmit cores.] 

Figure 9-1: Switch ASIC Clock Domains 

[Figure: Receiver Core containing a PLL (multiply by 4) driven by link_ref_if_clk that produces 
link_serial_clk, a counter, and four channels (Ch A in through Ch D in), each with a receiver and a 
deserializer, producing link_byte_clkA through link_byte_clkD.] 

Figure 9-2: Receiver Core Clock Scheme 

[Figure: Transmitter Core containing a PLL (multiply by 4) driven by link_ref_if_clk that produces 
link_serial_clk, a counter, and four channels, each with a serializer and a driver (Ch A out through 
Ch D out), clocked by link_byte_clkA through link_byte_clkD.] 

Figure 9-3: Transmitter Core Clock Scheme 



9.1 Sw_ref_clk 

This is the reference clock used to generate core_clk. This clock is multiplied by TBD to create 
core_clk. 

9.2 Link_ref_clk 

This is the reference clock used to generate link_ref_if_clk. This clock is multiplied by TBD to 
create link_ref_if_clk. The selection of this clock is based on the desired frequency for the 
link_serial_clk. There are, however, two additional requirements. The link_ref_clk must have a 
frequency that is an integer multiple of 72 KHz and must be frequency locked to the generation of the 
global synchronization signal sor_sync. The need for this requirement is discussed in Section 10. 

9.3 Link_ref_if_clk 

This clock is used as a reference clock by the transmitter and receiver cores to generate the 
serial clocks and the byte clocks. As is shown in Figure 9-2 and Figure 9-3, the transmitter and 
receiver cores contain a multiply by 4 of the incoming reference clock. Therefore, link_ref_if_clk must 
be operating at 1/4 the rate of the desired serial interface frequency. Assuming the following: 

• 72 KHz row time 

• 1700 slots (@36 bits per slot) per row 

• row data Is parsed across 2 serial lines 

With the above assumptions, the minimum serial frequency is 2.2032 GHz. This implies a 
link_ref_if_clk frequency of 2.2032/4 = 550.8 MHz. 
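
The arithmetic behind these two figures can be checked with a short Python sketch. The constant and function names are illustrative only and do not come from the specification; the inputs are the three assumptions listed above plus the multiply-by-4 PLL.

```python
# Assumptions taken from Section 9.3; names are hypothetical.
ROW_RATE_HZ = 72_000          # one row every 1/72 KHz
SLOTS_PER_ROW = 1700
BITS_PER_SLOT = 36
SERIAL_LINES_PER_LINK = 2     # row data is parsed across 2 serial lines
PLL_MULTIPLIER = 4            # transmitter/receiver core PLL


def min_serial_clk_hz(slots_per_row=SLOTS_PER_ROW):
    """Minimum bit rate per serial line needed to carry one row per row time."""
    bits_per_row = slots_per_row * BITS_PER_SLOT
    return ROW_RATE_HZ * bits_per_row // SERIAL_LINES_PER_LINK


def link_ref_if_clk_hz(serial_hz):
    """Reference clock that the multiply-by-4 PLL turns into the serial clock."""
    return serial_hz / PLL_MULTIPLIER


if __name__ == "__main__":
    serial = min_serial_clk_hz()
    print(serial / 1e9, "GHz serial,", link_ref_if_clk_hz(serial) / 1e6, "MHz reference")
```

Changing `slots_per_row` shows how the discard field or a different slot count would move the required serial frequency.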

9.4 Link_serial_clock

The link_serial_clock is generated within each transmitter and receiver core. As discussed
above, this clock must have a minimum frequency of 2.2032 GHz to support the data throughput
requirements.

9.5 Link_byte_clk

The link_byte_clk is generated within each transmitter and receiver core. Each serial input or
output channel generates a distinct link_byte_clk. This clock operates at 1/8 of the link_serial_clock
and is used to latch the byte data to the transmitter core and from the receiver core. All
link_byte_clk's are frequency locked, but there is no guaranteed phase relationship.

9.6 Global_byte_clk

The global_byte_clk is generated by dividing link_ref_if_clk by 2. Thus, it is the same
frequency as the link_byte_clk's, but there is no phase relationship. However, because they are
sourced from the same reference, global_byte_clk and all the link_byte_clk's are frequency locked.

9.7 Core_clk

10 Synchronization

Within the iTAP architecture, there are two levels of synchronization. The lower level is a point
to point synchronization between serial channels. Each serial channel transmits a framing pattern at
a user programmable rate (the default is 72 KHz); each receiving serial channel will then synchronize
itself to the framing pattern. Once this occurs, the serial channel is considered synchronized. More
specific details of this level of synchronization are provided in Section 11.2. A higher level of
synchronization involves synchronizing multiple data links within the Switch itself, and at the highest
level synchronizing all of the Switches and Port Processors. The methodology to do that is discussed
in this section.

The iTAP architecture requires that row data move across the switch fabric in lock step
fashion. During each row time (72 KHz), a Switch ASIC will be transmitting the previous row's data
and receiving the next row's data. All switching within the Switch ASIC is based on the fact that the
input row data is slot aligned prior to switching. To achieve global synchronization, a globally
distributed synchronization pulse, sor_sync, is provided to all Switch ASICs. (The Port Processor
ASICs are synchronized with the assistance of the Switch ASICs, see Section 10.5.) Sor_sync is a 72
KHz signal with the following system requirements:

• Worst case delta between any Switch ASIC receiving the sor_sync pulse must not exceed 16 ns. 

• The minimum high time and minimum low time of the sor_sync pulse must be greater than two 
global_byte_clk clock cycles. 

• The generation of the sor_sync pulse must be sourced from the same reference clock that is used 
to generate the link_ref_clk to ensure a signal that is frequency locked to global_byte_clk. 

As mentioned above, the sor_sync pulse is allowed some delta in its arrival time to each
Switch ASIC. Because this signal is used as the reference to start the transmission of row data,
row data arriving at a destination switch will not be slot aligned. In addition, there are
possible electrical deltas in the routing of the serial lines. Because the electrical delay delta between
Switch input ports affects the overall delay to compensate for, the following system requirement is
placed on the serial LVDS lines to/from the Switches:

• Between any two Switches, the worst case electrical delay from a transmitting device to a 
receiving device must not exceed 16 ns. 

In addition, the global sor_sync signal must be synchronized to each Switch ASIC. This may
result in an additional one global_byte_clk cycle of difference between Switch ASICs. All of these
possible worst case deltas are summed up and illustrated in Figure 10-1. A pictorial interconnect is
shown in Figure 10-2.

Figure 10-1: Worst Case Serial Data Arrival at Switch

Figure 10-2: Worst Case Switch Layout Scenario



To compensate for the delta in the receipt of serial data, the Switch ASIC uses buffering logic
to store data from each input port until it is guaranteed that all input ports have started to receive
data. At this point in time, data can begin to be switched. This, however, creates an additional
problem. Delaying the start of switching relative to the start of row implies that an entire row of
data cannot be switched by the end of the row time. The solution is that the link overhead field of the
row is not switched. The link overhead field contains information that is used between connecting
devices only and is generated by the output port logic.

10.1 Internal Synchronization 

Once the Switch ASIC is operational (i.e. following PLL spin up, Tensilica boot, etc.), the input
port control logic will be instructed from the core to initiate its synchronization. To do this, the Switch
ASIC samples the sor_sync signal in the global_byte_clk domain. When the first rising edge is detected
on the sor_sync signal, the Switch ASIC will start an internal counter, switch_sor_counter. Each row
time, this counter counts from 1 to ROW_SIZE, where ROW_SIZE is the number of bytes transferred on
a serial channel during one row time. Several internal events are triggered based on the value of this
counter. A summary of the programmable registers used to control these events can be found in
Table 11-11.

Due to electrical changes at the system level (based on voltage, temperature, etc.) and the
fact that the sor_sync signal is asynchronous to global_byte_clk, it is possible that the relative
position of the rising edge of the registered version of sor_sync compared to the rising edge of
switch_sor may vary from row to row. However, because the sor_sync signal is frequency locked to
global_byte_clk, the delta should never exceed +/- 2 global_byte_clk cycles. Error logic within the port
control logic will monitor the delta and set an error flag if the delta exceeds a user programmable
value.

Once the Switch ASICs have started their respective switch_sor_counters, they are all
synchronized to each other within a predictable worst case delta. This worst case delta is equal to 16
ns plus one global_byte_clk period. The 16 ns is from the maximum delta on the sor_sync line (see
above). The global_byte_clk period is due to the fact that the Switch ASIC port clocks are
asynchronous to each other and asynchronous to the sor_sync signal.
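
The counter and its drift monitor can be modeled in a few lines of Python. This is a behavioral sketch only, not the hardware design: the class and method names are hypothetical, ROW_SIZE is derived from the Section 9.3 assumptions (1700 slots x 36 bits / 2 channels / 8 bits per byte), and the error threshold stands in for the user programmable value.

```python
ROW_SIZE = 1700 * 36 // 2 // 8   # 3825 bytes per serial channel per row (assumed)


class SwitchSorCounter:
    """Behavioral model: one call to tick() per global_byte_clk cycle."""

    def __init__(self, row_size=ROW_SIZE, max_delta=2):
        self.row_size = row_size
        self.max_delta = max_delta   # stand-in for the programmable limit
        self.count = 0               # 0 means the counter has not started
        self.error = False

    def signed_delta(self):
        """Distance of the current count from switch_sor (count == 1)."""
        d = self.count - 1
        return d - self.row_size if d > self.row_size // 2 else d

    def tick(self, sor_sync_rising):
        if self.count == 0:
            if sor_sync_rising:      # first rising edge starts the counter
                self.count = 1
            return
        # count 1..row_size, then wrap
        self.count = 1 if self.count == self.row_size else self.count + 1
        if sor_sync_rising and abs(self.signed_delta()) > self.max_delta:
            self.error = True        # sor_sync drifted outside the window
```

With a frequency-locked sor_sync the edge always lands on a count of 1 (or within the allowed window around it), so `error` stays clear.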

10.2 Data Channel Synchronization 

To achieve the necessary bandwidth, the data link is split across two serial lines or channels.
The first stage of synchronization within the Switch is to synchronize each pair of channels. This
compensates for any electrical delta between the channels that make up a link. This also places a
requirement on the electrical delta between the serial channels that make up a link:

• Between any two serial data channels that make up a data link, the worst case electrical delta
must not exceed 2 ns.

The data alignment of two serial channels is accomplished using a ring buffer and is
discussed further in Section 11.2.

10.3 Slot Data Synchronization 

As mentioned previously, within the Switch ASIC all input ports must present synchronized
slot data to the core. Each device (Switch and Port Processor) will transmit serial data based on its
own internal sor signal. To compensate for the deltas in arrival time of incoming data as shown in
Figure 10-1, each serial input channel has a data slot FIFO (see Section 11.2.2.1). As shown in Figure
10-1, the worst case delta between any input port is 32 ns plus 1 global_byte_clk. Thus, the slot data
FIFO must be large enough to buffer the quantity of data that can be written in this time period.

To slot align the data inputs, all slot data FIFOs must be read simultaneously. To achieve this
goal, a delayed version of the switch_sor signal is used to start the read. Based on the frequency of
global_byte_clk and the information presented above, it is possible to determine a delay value relative
to the switch_sor signal when slot data will be available from all input ports. For example, Figure 10-3
shows that for the worst case scenario if the receiving Switch ASIC waits "32 ns + 3
global_byte_clk's" from its switch_sor, then slot data from all ports will be available. The 32 ns value
will translate into a certain number of global_byte_clk's once the frequency for global_byte_clk is
selected. The offset will be programmable and the FIFO will be sized to accommodate the fastest
possible global_byte_clk that will be allowed.

Figure 10-3: Worst Case Slot Data Available in Switch



There is also an additional consideration in determining the size of the slot data FIFO.
Starting with the system layout shown in Figure 10-2, the sor_sync feeding the destination Switch is
now the slow version as shown in Figure 10-4. Using the previous assumption that the Switch starts
reading slot data at a time offset of "32 ns + 3 global_byte_clk's" from the switch_sor, then the slot
data FIFO must be large enough to accommodate "48 ns + 4 global_byte_clk's" worth of data. This is
shown in Figure 10-5. This additional requirement exists because it is not known whether the system
has the layout/timing properties shown in Figure 10-2 or Figure 10-4. The switch must be designed
to support both scenarios.

Figure 10-4: Alternate Worst Case Switch Layout Scenario

Figure 10-5: Worst Case Slot Data FIFO Requirement



To support the required slot count of 1700 slots/row, the global_byte_clk must operate at
275.4 MHz. This translates to a receive slot rate of 8.17 ns/slot. The global_byte_clk to slot ratio is
2.25 global_byte_clk's per slot. Therefore, assuming the 275.4 MHz global_byte_clk rate, the data slot
FIFO needs to be 8 slots deep based on the following:

• 48 ns => 6 slots

• 4 global_byte_clk's => 2 slots

The slot numbers are rounded up to the nearest integer. With a FIFO depth of 8 slots, the
Switch can support a maximum global_byte_clk rate of 291.6 MHz which translates to 1800 slots/row.
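
The sizing arithmetic for the 1700 slots/row case can be reproduced with the following Python sketch. Function names are illustrative, and rounding the 32 ns read offset up to whole clocks is an assumption; the specification only says the value will translate into some number of global_byte_clk's.

```python
import math

ROW_RATE_HZ = 72_000
BITS_PER_SLOT = 36
BITS_PER_CLK = 16    # two serial channels, one byte each per global_byte_clk


def global_byte_clk_hz(slots_per_row):
    """global_byte_clk needed to move one row of slots per row time."""
    return ROW_RATE_HZ * slots_per_row * BITS_PER_SLOT / BITS_PER_CLK


def slot_time_ns(slots_per_row):
    return 1e9 / (ROW_RATE_HZ * slots_per_row)


def fifo_depth_slots(slots_per_row, skew_ns=48.0, extra_clks=4):
    """Depth for "48 ns + 4 global_byte_clk's", each term rounded up to slots."""
    clks_per_slot = BITS_PER_SLOT / BITS_PER_CLK            # 2.25
    skew_slots = math.ceil(skew_ns / slot_time_ns(slots_per_row))
    clk_slots = math.ceil(extra_clks / clks_per_slot)
    return skew_slots + clk_slots


def read_offset_clks(slots_per_row, skew_ns=32.0, extra_clks=3):
    """Read delay after switch_sor for "32 ns + 3 global_byte_clk's" (assumed ceil)."""
    clk_hz = global_byte_clk_hz(slots_per_row)
    return math.ceil(skew_ns * clk_hz / 1e9) + extra_clks
```

For 1700 slots/row this yields a 275.4 MHz global_byte_clk and a depth of 8 slots, matching the figures above; `global_byte_clk_hz(1800)` gives the 291.6 MHz upper rate.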

10.4 Grant Packet Synchronization 

The grant lines will synchronize in a fashion very similar to the data lines. One difference is 
the fact that there is only one grant input per port and therefore there is no channel synchronization
required. The grant line, however, does need to pass from the link_byte_clk domain to the 
global_byte_clk domain. Although a three stage ring buffer could do this, the grant channels will use 
the same four stage ring buffer being used by the data channels (for design simplification only). The 
grant lines are subject to the same constraints as the data lines and therefore the grant packet FIFO 
must be able to buffer the same amount of data. Therefore, the grant packet FIFO will also be sized at 
8 slots. 

10.5 Port Processor Synchronization 

As described above, all Switch ASICs receive a globally distributed sor_sync signal. This is 
used to synchronize their start of row timing. The Port Processor ASICs do not receive such a signal. 
In the iTAP system, it is possible for the Port Processor to be remotely located from the switch fabric 
and it would be impossible to meet the 16 ns worst case delta on the sor_sync line and the 16 ns 
maximum electrical delay to the Switch ASICs. The Switch ASICs, however, must still receive row data 
from the Port Processors that is synchronized, within the limits discussed above, to the globally 
distributed sor_sync line. 

10.5.1 Switch ASIC Requirements 

Similar to the Switch, the Port Processor has an internal counter used to maintain a 72 KHz 
synchronization signal. Following initialization, the Port Processor will start its 72 KHz 
synchronization counter and begin transmitting idle frames on its data links. As part of its standard 
input channel synchronization (see Section 11.2.1), the Switch input logic will synchronize to the 
framing pattern on the data links. Once synchronization is complete, the Switch input link(s) are now 
synchronized to the Port Processor start of row. There is a valid region centered about the switch_sor 
signal in which it is acceptable for the Port Processor start of row to be located. This is shown in 
Figure 10-6. If the Switch is receiving a start of row from the Port Processor within the window shown
in Figure 10-6, then the Port Processor is synchronized to the Switch within acceptable limits. If not,
then the Port Processor needs to be adjusted.

Figure 10-6: Valid Port Processor Start of Row Window



To adjust the Port Processor, the Switch must measure the difference between switch_sor and
the received start of row from the Port Processor. The delta, in terms of global_byte_clk's, is always
measured from the rising edge of switch_sor to the rising edge of the Port Processor start of row, as
shown in Figure 10-7. The delta shown in Figure 10-7 is measured every row time. The measured
delta is registered in the sync_offset_countN register (where 0 <= N <= 11, one for each port) which is
readable by the Tensilica processor.

Figure 10-7: Determination of sync_offset_count

To aid with synchronizing the Port Processor, two bits are passed to the Port Processor in the 
link overhead field of the grant channel. These bits and their definition are summarized in Table 10-8. 



Table 10-8: Port Processor Synchronization Bits

pp_synced  pp_not_synced  Comments
---------  -------------  ----------------------------------------------------------------
0          0              Switch has not determined if Port Processor is synced to Switch.
                          This case only occurs when the Switch has not synchronized (or
                          lost sync) to the data link.
0          1              Port Processor start of row is not within acceptable limits of
                          switch_sor.
1          0              Port Processor start of row is within acceptable limits of
                          switch_sor.
1          1              Invalid. Hunt the designer down like a wild animal.



Once the data link is synchronized, the Switch will check the position of the Port Processor 
start of row as shown in Figure 10-6. If the start of row falls within the allowable range, the pp_synced 
bit is set. Otherwise the pp_not_synced bit is set. In addition, the measured offset as shown in Figure 
10-7 (sync_offset_count) is also inserted into the link overhead of the grant channel. (A complete 
definition of the link overhead field can be found in Table 11-12). 
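
A minimal Python sketch of the measurement and the resulting bit pair follows. The function names and the `window` parameter (half-width of the valid region around switch_sor, in global_byte_clk's) are hypothetical; the specification does not give the window size here.

```python
def sync_offset_count(switch_sor_count, pp_sor_count, row_size):
    """Clocks from switch_sor to the Port Processor start of row.

    Always measured forward from switch_sor, so a Port Processor that is
    slightly early shows up as a value just below row_size.
    """
    return (pp_sor_count - switch_sor_count) % row_size


def sync_bits(link_synced, offset, row_size, window):
    """Return (pp_synced, pp_not_synced) as encoded in Table 10-8."""
    if not link_synced:
        return (0, 0)                       # Switch cannot judge yet
    signed = offset if offset <= row_size // 2 else offset - row_size
    return (1, 0) if abs(signed) <= window else (0, 1)
```

The (1, 1) combination is never produced, in line with the table marking it invalid.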

In the link overhead field of the data link, the Switch will monitor the pp_sync_done bit. This 
bit is set by the Port Processor once it has become synchronized to all Switches that it is connected to. 



10.5.2 Port Processor Requirements 

The Port Processor must be running an internal synchronization loop at a 72 KHz rate similar
to what is being done in the Switch (see Section 10.1). Following a reset, the Port Processor will start
its internal synchronization counter and begin transmitting idle packets with a valid framing pattern
on its data links. If two ports are being used, then both ports must be transmitting the idle packets on
their respective data links. The Port Processor will monitor the two bits of the link overhead section
identified in Table 10-8.

If only one port is being used, then the Port Processor simply examines the pp_synced and
pp_not_synced bits and determines a course of action. If the pp_synced bit is set, then the Port
Processor will set the pp_sync_done bit in the link overhead field of the corresponding data link. After
this, the Port Processor is good to go. If the pp_not_synced bit is set, then the Port Processor will
adjust its internal synchronization based on the sync_offset_count value. This will cause the Switch
to lose and then regain sync on the data link. Following this, the Port Processor should be
synchronized to the Switch and the pp_synced bit should be set. If not, the adjustment procedure can
be repeated. Once the Port Processor detects that the pp_synced bit is set, it must set the
pp_sync_done bit.

If both ports are being used, the method to determine how to adjust the internal
synchronization counter becomes more complicated. If both ports respond with the pp_synced bits
set, then the Port Processor only has to set the pp_sync_done bit and is good to go. Otherwise, the
Port Processor adjusts its internal synchronization. When two ports are being used, the goal is to
center the Port Processor's synchronization between the two offsets received. In general, this would
imply averaging the two offset values received. This method works except for the scenario shown in
Figure 10-9. Based on the figure, assume the row time to be 100 clocks. Based on a random start, the
Port Processor's internal sor falls somewhere in between the sor for switch1 and switch2. In
particular, the Port Processor's sor is 2 clocks after switch1_sor and 4 clocks before switch2_sor.
Thus the perfect adjustment for the port_processor_sor is to delay it by one clock and therefore center
it between switch1_sor and switch2_sor. However, based on the fact that the sync_offset_count value
is always measured from the switch_sor to the port_processor_sor, the following values are obtained:

• switch1_sor to port_processor_sor = 2

• switch2_sor to port_processor_sor = 96

Averaging these two numbers gives a result of 49, which would be an incorrect adjustment.
The correct method is as follows. Based on the information shown in Figure 10-1, the largest delta
between sync_offset_count's from two Switch ASICs is "32 ns + 1 global_byte_clk". Based on the
global_byte_clk frequency, the 32 ns will translate into some number of global_byte_clk's. If the delta
between sync_offset_count's from two Switch ASICs is greater than "32 ns + 1 global_byte_clk", then
the scenario shown in Figure 10-9 has occurred. To calculate the adjustment, the row time (in this
case 100) needs to be subtracted from the larger value (in this case 96). This value is then added to
the smaller value and the result is averaged. Thus the algorithm is:

if mag(sync_offset_count1 - sync_offset_count2) > TBD programmable value
    adjust = ((max(sync_offset_count1, sync_offset_count2) - row_time) +
              min(sync_offset_count1, sync_offset_count2)) / 2
else
    adjust = ave(sync_offset_count1, sync_offset_count2)

A positive adjust implies delaying the port_processor_sor and a negative adjust implies 
advancing the port_processor_sor. 
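
The two-port calculation can be sketched in Python by following the prose steps literally (subtract the row time from the larger offset when the offsets straddle the row boundary, then average). The function name is illustrative and `wrap_threshold` stands in for the TBD programmable value; the sketch only computes the number and does not model how hardware applies it.

```python
def sor_adjust(offset1, offset2, row_time, wrap_threshold):
    """Average two sync_offset_count values, unwrapping across the row boundary.

    wrap_threshold is the largest legitimate difference between the two
    offsets ("32 ns + 1 global_byte_clk" expressed in clocks).
    """
    lo, hi = min(offset1, offset2), max(offset1, offset2)
    if hi - lo > wrap_threshold:
        hi -= row_time          # the larger offset really lies before switch_sor
    return (lo + hi) / 2
```

For the Figure 10-9 example (offsets 2 and 96, row time 100) the unwrapped offsets are 2 and -4, so the result has a magnitude of one clock rather than the 49 produced by plain averaging.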
Figure 10-9: Special Case of Port Processor Synchronization Adjustment

Once the adjustment has been made, this will cause the Switch to lose and then regain sync
on the data link. Following this, the Port Processor should be synchronized to both Switches and the
pp_synced bit from both Switches should be set. If not, the adjustment procedure can be repeated.
Once the Port Processor detects that both pp_synced bits are set, it must set the pp_sync_done
bit.

10.6 Error Monitoring

The iTAP architecture requires that all ASICs (Switches and Port Processors) remain
synchronized, within the limits previously defined. Based on the fact that all ASICs are frequency
locked, once the system is synchronized it should not drift out of synchronization. Provided that the
individual serial channels remain synchronized (i.e. consistent framing pattern), the only way for the
devices to drift out of sync with respect to each other is if their link_ref_clk's (see Section 9) are not
truly frequency locked. This is a design requirement of the iTAP system and this type of problem
should only arise due to some type of hardware problem. The methods to detect these types of errors
are discussed below.

As mentioned in Section 10.1, once the Switch ASIC has its internal synchronization counter
running, it constantly monitors the rising edge of the incoming sor_sync signal and compares it to its
own switch_sor. Due to clock jitter and crossing clock boundaries, it is possible for the sor_sync
signal to move slightly with respect to switch_sor. A window of +/- 2 global_byte_clk's is considered
acceptable. If the rising edges of the signals vary more than this, an interrupt will be sent to the
Tensilica.

The synchronization between the Port Processor and the Switch is also constantly monitored 
by the Port Processor. Once the Port Processor is synchronized to the Switch(es), it should never see 
the pp_not_synced bit set. If this occurs, it implies that the Port Processor and the Switch are not
frequency locked.

11 Switch Ports

The Switch ASIC has 12 input ports and 12 output ports. Each input port to the Switch
consists of 3 serial lines: 2 for data with request and one for grant. Each output port from the Switch 
also consists of 3 serial lines: 2 for data with request and one for grant. These lines are implemented 
as LVDS, Low Voltage Differential Signaling. 

11.1 Serial Data Stream Format 

The structure of a data/request row is shown in Figure 2-2. Splitting the row across two serial 
channels results in the serial data stream shown in Figure 11-1. The payload section contains 840 
slots. The link overhead contains 10 slots (the contents of the link overhead field are listed in 
Section 11.6). Added to the end of the serial stream is a discard field containing a user programmable
number of bytes that are always discarded. The required row time is 72 KHz (see Section 10) and 
therefore the size (in bits) of the entire serial stream will determine the clock frequency for the serial 
interface. The discard field allows some flexibility in the selection of the serial clock frequency. 



1 row time = 72 KHz (13.89 us)

Figure 11-1: Data/Request Serial Stream Partitioning



11.2 Input Port 

A top level view of the input ports is shown in Figure 11-2. A description of the signals is
provided in Table 11-3. Several of the output buses are replicated, one for each link, and are
indexed by N where 0 <= N <= 11.






Proprietary and Confidential Information of Onex Communications Corporation 



[Figure labels: primary I/O (buses marked 12): data_channel0p, data_channel0n, data_channel1p, data_channel1n, grant_channelp, grant_channeln; blocks: Serial Receive Logic, Data Assembly Logic, Misc. Control Logic; to/from input port control logic: start_of_rowN[2:0], syncN[2:0], dp_read_slot_data, gnt_read_slot_data, capture_data, capture_grant; to/from core logic: dp_islot_data_N, gnt_islot_data_N, link_ref_if_clk, global_byte_clk, core_clk, reset_cclk, reset_gbclk, programmable_values]

Figure 11-2: Switch Input Ports

Table 11-3: Input Port Signal Descriptions

Signal Name            Width  Direction     Description/Comments                                      Source/Dest  Active State
---------------------  -----  ------------  --------------------------------------------------------  -----------  ------------
data_channel0-1 p(n)          input         Differential serial lines used for data. Data and                      NA
                                            request are parsed across two channels to achieve the
                                            necessary 3.8 Gbps bandwidth.
grant_channelp(n)             input         Differential serial line used for the incoming grant                   NA
                                            information.
start_of_row[2:0]             output        When active, indicates the receipt of the start of row                 High
                                            on the corresponding serial channel.
sync[2:0]                     output        When active, indicates that the corresponding serial                   High
                                            channel is synchronized.
read_slot_data                input         A data slot will be read from the input port for every                 High
                                            core_clock in which read_slot_data is high.
read_grant_data               input         A grant packet will be read from the input port for                    High
                                            every core_clock in which read_grant_data is high.
capture_data                  input         When this signal is active, at the start of the next                   High
                                            frame all data channel input FIFOs (associated with the
                                            respective port) will start writing data and continue
                                            to write data until the signal is inactive.
capture_grant                 input         When this signal is active, at the start of the next                   High
                                            frame the grant channel input FIFO will start writing
                                            data and continue to write data until the signal is
                                            inactive.
link_ref_if_clk        1      input         Clock used by the deserializer/clock recovery logic                    NA
                                            (see Section 9).
global_byte_clk        1      input         Clock used by the port logic (see Section 9).                          NA
core_clk               1      input         Clock used by the core logic (see Section 10).                         NA
reset_cclk             1      input         ASIC reset, synchronized to the core clock. When                       Low
                                            active, all registers are synchronously placed into a
                                            known state. Reset must remain active for a minimum of
                                            TBD byte clocks.
reset_gbclk            1      input         ASIC reset, synchronized to the global byte clock.                     Low
                                            When active, all registers are synchronously placed
                                            into a known state. Reset must remain active for a
                                            minimum of TBD byte clocks.
programmable_values           input/output  The input port has several programmable registers as                   NA
                                            defined in Table 11-11.



11.2.1 High Speed Serial Input Logic Blocks 

All high speed serial inputs have a common front end as shown in Figure 11-4. Basic
operation is as follows. Serial data is converted to byte wide data and then passed through a 
descrambler to a four stage ring buffer. The ring buffer is used to compensate for differential arrival 
times of the serial data across two serial lines and to provide the ability to transfer across clock 
domains. The byte data is also passed to frame detection logic that finds a framing pattern and 
provides bit shift information to the deserializer. Prior to being used to communicate data, each line 
must be bit aligned at the receiver. Once this is done, the receiving logic must be synchronized to a 
framing pattern. After completing this, the serial link is considered synchronized. 

There are four different clock domains that are associated with the port logic: core_clk,
global_byte_clk, link_byte_clk, and link_serial_clk. Detailed information about these clocks can be
found in Section 9. Referring to Figure 11-4, the link_serial_clk is used in the deserializer and clock
recovery logic. The output side of the ring buffer is operated in the global_byte_clk domain. All other logic
shown in Figure 11-4 is operated in the link_byte_clk domain.

Figure 11-4: High Speed Serial Input Functional Block Diagram

11.2.1.1 Deserializer 

Serial data enters a high speed deserializer. In this logic, the transmitting clock is recovered
and the serial data is converted to a byte wide output. The byte wide data is made
available as well as link_byte_clk (1/8 of the recovered clock). The logic also has an input, datasync,
that allows a one bit shift of the output byte data every time the bit_shift line is toggled. This logic will
be provided as IP by the ASIC vendor.

11.2.1.2 Descrambler 

The descrambler uses a 1 + x^6 + x^7 polynomial identical to the one used for SONET. All serial
data is scrambled using a corresponding scrambler before being transmitted. The descrambler is reset
to its initial seed value of all 1's at the start of every frame. As is shown in Figure 11-4, data is taken
directly from the deserializer block to the Comparators block (used to detect the framing pattern). The
two byte framing pattern is not scrambled on the transmit side and therefore it is not required to be
de-scrambled on the receiving side. All other data is scrambled and must be descrambled before being
written to the FIFO.
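
A frame-synchronous scrambler of this kind can be modeled in Python as a 7-bit shift register whose output is XORed onto the data. The sketch below is a bit-serial software model, not the hardware implementation; the function names are hypothetical, and it assumes MSB-first bit order and that the sequence restarts at the first byte after the unscrambled framing pattern.

```python
SEED = 0x7F  # seven 1's, reloaded at the start of every frame


def keystream(n_bytes):
    """Return n_bytes of the 1 + x^6 + x^7 sequence, MSB first."""
    state = SEED                       # bit 6 models the output stage
    out = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for _ in range(8):
            out_bit = (state >> 6) & 1
            feedback = ((state >> 6) ^ (state >> 5)) & 1   # taps at x^7 and x^6
            state = ((state << 1) | feedback) & 0x7F
            byte = (byte << 1) | out_bit
        out.append(byte)
    return bytes(out)


def descramble_frame(payload):
    """XOR the payload (framing pattern already removed) with the keystream."""
    return bytes(p ^ k for p, k in zip(payload, keystream(len(payload))))
```

Because the operation is a plain XOR with a fixed sequence, the same function scrambles on the transmit side and descrambles on the receive side.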

11.2.1.3 Comparators 

The comparators compare the byte data to a programmable two byte framing pattern and 
provide a frame_detect output. To improve the speed at which framing synchronization can be made 
or re-established when lost, there are actually 8 parallel comparators. Each comparator is comparing 
the framing pattern to one of 8 possible shifted versions of the input data. Based on the comparator 
that detects the framing pattern, the deserializer can be instructed to shift the appropriate number of 
bits. 
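
The eight comparators can be modeled in Python as eight 16-bit windows taken from three consecutive received bytes. The loop stands in for what the hardware does in parallel; the function name is illustrative, and the pattern value used in the usage note is a placeholder, since the real framing pattern is programmable.

```python
def find_bit_shift(window24, pattern16):
    """Return the bit shift (0..7) at which pattern16 appears, else None.

    window24 holds three consecutive received bytes, oldest byte in the
    most significant position. Shift k compares bits k..k+15 of the window.
    """
    for shift in range(8):
        candidate = (window24 >> (8 - shift)) & 0xFFFF
        if candidate == pattern16:
            return shift
    return None
```

For example, with a placeholder pattern of 0xF628 arriving three bits late, `find_bit_shift(0xF628 << 5, 0xF628)` reports a shift of 3, which is the number of bits the deserializer would be told to slip.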
11.2.1.4 Frame Counters and Control

The frame counter is used in conjunction with the frame detect to detect and ensure a correct 
framing pattern. Once a framing pattern has been detected, the frame counter will check to ensure 
that the framing pattern is detected in the same position of each incoming row of data. After receiving 
a programmable number of consecutive correctly placed framing patterns, the sync line will be set 
indicating that the interface is synchronized. Once the interface is considered synchronized, the 
start_of_row output will be pulsed at the start of each row of data. Any frame of data in which the 
framing pattern is not correctly detected will result in the generation of a framing_error. After a 
programmable number of consecutive framing errors, the sync line is reset and the interface is now 
considered not synchronized. At this time, the interface will attempt to re-establish sync. 

In addition to data, a start_of_row bit is also written to the ring buffer. The start of row bit 
will be set high to coincide with the first byte of data in each row. 
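
The sync acquisition and loss behavior described above amounts to a small state machine with two programmable thresholds. The Python sketch below is one possible reading; the class name, method name, and the choice to report a framing error for every bad row (synchronized or not) are assumptions.

```python
class FrameSync:
    """Tracks the sync line from per-row framing pattern checks."""

    def __init__(self, good_needed, bad_needed):
        self.good_needed = good_needed   # consecutive good rows to declare sync
        self.bad_needed = bad_needed     # consecutive bad rows to drop sync
        self.good = 0
        self.bad = 0
        self.sync = False

    def on_row(self, pattern_ok):
        """Call once per row; returns True when a framing_error is raised."""
        if pattern_ok:
            self.bad = 0
            if not self.sync:
                self.good += 1
                if self.good >= self.good_needed:
                    self.sync = True
            return False
        self.good = 0
        if self.sync:
            self.bad += 1
            if self.bad >= self.bad_needed:
                self.sync = False        # interface must re-establish sync
                self.bad = 0
        return True
```

A single bad row while synchronized raises a framing error but leaves `sync` set; only `bad_needed` bad rows in a row clear it.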

11.2.1.5 Ring Buffer 

The ring buffer serves two purposes. First, it allows the two data serial lines to be
synchronized at the byte level. The necessity for this is discussed below. Second, it provides a method
to cleanly transfer data across clock boundaries. The write side is operating from a
link_byte_clk clock. The read side is operating at the same frequency, but from the global_byte_clk
clock. There is no phase relationship between the clocks, but they are frequency locked such that
overrun/underrun conditions will not occur within the ring buffer.

The ring buffer is implemented using four stages. Two stages are required to cleanly transfer
data from the link_byte_clk domain to the global_byte_clk domain. A third stage provides buffering to
allow a one link_byte_clk clock difference between the receipt of serial data on the two serial lines
comprising the data link. This is required because on the transmitting side, the link_byte_clk's are
frequency locked but not phase locked. A fourth stage is added to compensate for electrical deltas
between the two serial lines comprising the data link. Assuming a byte rate of 275.4 MHz, an
electrical delta of 3.6 ns can be compensated for with the additional buffer stage.
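
The 3.6 ns figure is one byte clock period, which a short Python check makes explicit. The function name is illustrative; the 2 ns value is the channel-to-channel limit stated in Section 10.2.

```python
def stage_margin_ns(byte_rate_hz=275.4e6):
    """Skew absorbed by one extra ring buffer stage: one byte clock period."""
    return 1e9 / byte_rate_hz


MAX_CHANNEL_DELTA_NS = 2.0   # requirement from Section 10.2

if __name__ == "__main__":
    margin = stage_margin_ns()
    print(f"{margin:.2f} ns per stage, requirement {MAX_CHANNEL_DELTA_NS} ns")
```

At the 275.4 MHz byte rate the single stage covers the 2 ns requirement with margin to spare.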

11.2.2 Input Data Line Grouping 

To support its maximum capability, the iTAP architecture requires a bandwidth of 3.8 Gbps
on each data/request port.

Data/request is always presented to the core as a 36 bit data slot plus an islot_start_of_row 
control bit. The islot_start_of_row control bit will be high when the corresponding slot is the first slot 
of the row. Otherwise, islot_start_of_row will be low. To assemble the byte data from the two serial 
channels, one byte is read from each ring buffer and concatenated into a 16 bit wide word with serial 
channel 0 always as the MSB. Reading from the ring buffer continues in this manner with the new 16 
bit word being concatenated to the LSB of the previous word. The concatenated data is then parsed 
into 36 bit slots and stored in a FIFO. A separate top level control module (see Section 11.3) will 
control the reading of the slot data. 
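
The assembly step above can be sketched in software as follows. This is only a behavioral 
illustration under the stated byte ordering (channel 0 as the MSB of each 16 bit word, new words 
appended at the LSB end); the function name and the use of Python integers as a bit accumulator are 
assumptions, not the hardware implementation. 

```python
def assemble_slots(ch0: bytes, ch1: bytes, slot_bits: int = 36) -> list[int]:
    """Merge two byte streams into 16-bit words and cut them into slots."""
    if len(ch0) != len(ch1):
        raise ValueError("both serial channels must supply the same byte count")
    acc = 0        # bits waiting to be cut into slots
    acc_bits = 0   # number of valid bits in acc
    slots = []
    for b0, b1 in zip(ch0, ch1):
        word = (b0 << 8) | b1          # channel 0 is always the MSB
        acc = (acc << 16) | word       # new word goes to the LSB side
        acc_bits += 16
        while acc_bits >= slot_bits:
            acc_bits -= slot_bits
            slots.append((acc >> acc_bits) & ((1 << slot_bits) - 1))
            acc &= (1 << acc_bits) - 1  # keep only the leftover bits
    return slots
```

Nine byte pairs carry 144 bits, which is exactly four 36 bit slots. 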

The start of row control bit from all data channels is compared to ensure that all channels are 
aligned. If there is a discrepancy, an error flag is set and the misalignment counter for the 
corresponding data channel is incremented. The data misalignment counter is three bits and will stop 
incrementing once it has reached its maximum value. An error counter size of only three bits is based 
on the fact that a start of row misalignment can only be caused by the data channel losing sync or a 
spurious glitch in the data assembly logic. In the first case, losing sync implies that the channel is 
down, all data is voided, and the channel will be reset once sync is re-established. In the latter case, a 
single error over an extended period might be in the realm of possibilities. However, several errors in a 
row probably indicate that the data assembly logic is out of sync and needs to be reset. 
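
The saturating behavior of the three bit misalignment counter can be modeled as below. The class and 
method names are hypothetical; only the width and the stop-at-maximum behavior come from the text. 

```python
class MisalignmentCounter:
    """Three-bit error counter that stops at its maximum value."""

    MAX_VALUE = (1 << 3) - 1  # 7 for a three-bit counter

    def __init__(self) -> None:
        self.value = 0

    def record_misalignment(self) -> None:
        # Saturate instead of wrapping back to zero.
        if self.value < self.MAX_VALUE:
            self.value += 1

    def reset(self) -> None:
        self.value = 0
```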

As was shown in Figure 11-4, the write side of the ring buffer is in the link_byte_clk domain. 
The read side of the data channel assembly logic operates in the core_clk domain. All other logic 
associated with Figure 11-5 operates in the global_byte_clk domain. 

Figure 11-5: Channel Assembly for Serial Data Channels 



11.2.2.1 Data Slot FIFO 

The slot FIFO serves two functions: transferring slot data from the global_byte_clk domain to 
the core_clk domain and providing buffering to allow slot alignment of all input ports. As documented in 
Section 10.3, the slot FIFO should be sized at a minimum of 8 slots. The read and write domains of 
the slot FIFO are totally asynchronous and have no relationship to each other. Based on the clocking 
requirements (see Section 9) it is guaranteed that the read rate is faster than the write rate. Therefore, 
the FIFO logic does not need to support a "full" condition. To ensure correct operation, Johnson 
counters will be used to monitor read and write pointers. 
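
A Johnson (twisted ring) counter changes exactly one bit per step, which is what makes it safe to 
sample a pointer from another clock domain. The sketch below is illustrative only; the 4 bit width is 
an assumption chosen because it yields the 8 states needed for an 8 slot FIFO. 

```python
def johnson_next(state: int, width: int) -> int:
    """Shift left and feed the inverted MSB back into the LSB."""
    msb = (state >> (width - 1)) & 1
    return ((state << 1) & ((1 << width) - 1)) | (msb ^ 1)


def johnson_sequence(width: int) -> list[int]:
    """Return the 2 * width states of a Johnson counter starting at 0."""
    state = 0
    seq = []
    for _ in range(2 * width):
        seq.append(state)
        state = johnson_next(state, width)
    return seq
```

For a width of 4 the states are 0000, 0001, 0011, 0111, 1111, 1110, 1100, 1000, after which the 
counter returns to 0000. 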



11.2.3 Grant Packets 

The grant channel is handled in a similar manner to the data channels. The main difference is 
that the grant interface consists of only one serial channel. Therefore, the slot assembly only requires 
assembling and partitioning data from one serial stream. As a result of this, the grant slot FIFO needs 
to be only 4 slots deep (see Section 10.4). The read and write domains of the slot FIFO are totally 
asynchronous and have no relationship to each other. Based on the clocking requirements (see 
Section 9) it is guaranteed that the read rate is faster than the write rate. Therefore, the FIFO logic 
does not need to support a "full" condition. To ensure correct operation, Johnson counters will be 
used to monitor read and write pointers. 



11.3 Input Port Control 

To support the iTAP architecture, the core logic must receive input data that is synchronized 
across all 12 input ports (data and grant). To achieve this, a top level input port controller is used to 
synchronize events across all input ports (the synchronization methodology is described in detail in 
Section 10). A top level view of the input port control is shown in Figure 11-6. A description of the 
signals that interface to the core is provided in Table 11-7. A description of the signals that interface 
to the input port logic is provided in Table 11-3. 

Figure 11-6: Top Level View of the Input Port Control Logic 



Table 11-7: Input Port Control Logic Signal Descriptions 

| Signal Name | Width | Direction | Description/Comments | Source/Dest | Active State |
|---|---|---|---|---|---|
| port_clock | | | Clock used by the port logic (see Section 9). | | NA |
| core_clock | | | Clock used by the core logic (see Section 9). | | NA |
| reset | | | ASIC reset. When active, all registers are synchronously placed into a known state. Reset must remain active for a minimum of TBD byte clocks. | | Low |
| idslot_num | 11 | output | Indicates the current input data slot number. | | |
| idslot_phase0 | 1 | output | This signal is asserted during the first cclk cycle for each new incoming data slot period. | | High |
| idslot_phase1 | 1 | output | This signal is asserted during the second cclk cycle for each new incoming data slot period. | | High |
| idslot_data | 36 | output | Current input data slot contents. | | NA |
| idslot_row_end | 1 | output | This signal is asserted during the last data slot of the row. If the serial input links are configured to carry more than 1700 slots per row, then this signal will be asserted for slots 1699 and higher. Speeding up the input links will only be done as part of the IC characterization, not intended for normal operation. | | High |
| igslot_num | 11 | output | Indicates the current input grant slot number. | | |
| igslot_phase0 | 1 | output | This signal is asserted during the first cclk cycle for each new incoming grant slot period. | | High |
| igslot_phase1 | 1 | output | This signal is asserted during the second cclk cycle for each new incoming grant slot period. | | High |
| igslot_data | 36 | output | Current input grant slot contents. | | NA |
| igslot_row_end | 1 | output | This signal is asserted during the last grant slot of the row. If the serial input links are configured to carry more than 850 slots per row, then this signal will be asserted for slots 849 and higher. Speeding up the input links will only be done as part of the IC characterization, not intended for normal operation. | | High |
| sor_sync | | | Primary input sor_sync. | | NA |
| switch_sor | | | Internal synchronization signal (see Section 10.1). | | NA |
| programmable_values | | | The input port has several programmable registers as defined in Table 11-11. | | NA |
| valid_data | | | For every core_clock in which valid_data is high, the output slot_data contains valid slot data and must be latched in the current core_clock cycle. | | High |
| valid_grant | | | For every core_clock in which valid_grant is high, the output grant_data contains a valid grant packet and must be latched in the current core_clock cycle. | | High |
| core_ready_for_data | | | When active, this signal indicates that the core is ready to receive slot data. | | High |
| core_ready_for_grant | | | When active, this signal indicates that the core is ready to receive grant packets. | | High |
| error_information | | | Error information available to the core as described in | | NA |
| select_error_information | | | Selects which error information is made available to the core. | | NA |



11.4 Output Ports 

A top level view of the output ports is shown in Figure 11-8. A description of the signals is 
provided in Table 11-9. 

Figure 11-8: Switch Output Port 

Table 11-9: Output Port Signal Descriptions 

| Signal Name | Description/Comments | Active State |
|---|---|---|
| data_channel0-1 | Differential serial lines used for data. Data and request are parsed across both channels to achieve the necessary 3.8 Gbps bandwidth. | NA |
| grant | Differential serial line used for the outgoing grant information. | NA |
| data_source | Selects if the data transmitted on the serial data/request channels will be slot data (=1) or the idle pattern (=0). | NA |
| grant_source | Selects if the data transmitted on the grant channel will be grant data (=1) or the idle pattern (=0). | NA |
| insert_framing_pattern | When this signal is active, the user programmable framing pattern will be inserted into all the output serial channels. | High |
| load_data | When this signal is active, the output port logic will latch in the current value on the slot_data lines. | High |
| load_grant | When this signal is active, the output port logic will latch in the current value on the grant_data lines. | High |
| serial_clock | Clock used by the deserializer/clock recovery logic (see Section 9). | NA |
| port_clock | Clock used by the port logic (see Section 9). | |
| core_clock | Clock used by the core logic (see Section 9). | |
| reset | ASIC reset. When active, all registers are synchronously placed into a known state. Reset must remain active for a minimum of TBD byte clocks. | Low |
| slot_data[36:0] | The 36 LSBs ([35:0]) contain the outgoing slot data. The MSB ([36]) is high when the corresponding slot data contains the start of row. | NA |
| grant_data[48:0] | The 48 LSBs ([47:0]) contain an outgoing grant packet. The MSB ([48]) is high when the corresponding grant packet contains the start of row. | NA |
| programmable_values | The input port has several programmable registers as defined in Table 11-11. | NA |

Proprietary and Confidential Information of Onex Communications Corporation 



11.4.1 High Speed Serial Output Logic Blocks 

Figure 11-10: High Speed Serial Output Functional Block Diagram 

11.4.1.1 Serializer 

This logic will be provided as IP by the ASIC vendor. 

11.4.1.2 Scrambler 

The scrambler uses a 1 + x^6 + x^7 polynomial identical to the one used for SONET. The 
scrambler is reset to its initial seed value of all 1's at the start of every row. The scrambler function is 
bypassed for the two bytes containing the framing pattern (i.e. the framing pattern is not scrambled). 
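
A behavioral sketch of such a scrambler is shown below. The polynomial and all 1's seed come from the 
text; the MSB-first bit order follows common SONET practice. Two details are assumptions made only 
for illustration: the framing pattern is taken to occupy the first two bytes of the row, and the 
shift register is taken to start on the first scrambled byte. The function names are hypothetical. 

```python
def sonet_keystream(n_bytes: int) -> bytes:
    """Keystream of the 1 + x^6 + x^7 LFSR, seeded with all ones, MSB first."""
    state = 0x7F  # seven-bit register, all 1's at the start of every row
    out = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for _ in range(8):
            bit = (state >> 6) & 1                      # output of stage 7
            feedback = ((state >> 6) ^ (state >> 5)) & 1  # stage 7 XOR stage 6
            state = ((state << 1) | feedback) & 0x7F
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def scramble_row(row: bytes, framing_bytes: int = 2) -> bytes:
    """XOR a row with the keystream, leaving the leading framing bytes clear.

    Assumption: the framing pattern sits at the start of the row.
    """
    key = sonet_keystream(len(row) - framing_bytes)
    body = bytes(b ^ k for b, k in zip(row[framing_bytes:], key))
    return row[:framing_bytes] + body
```

Because scrambling is a plain XOR with a fixed keystream, applying scramble_row twice returns the 
original row, which is how the descrambler on the receiving side recovers the data. 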

11.4.1.3 Link Overhead Buffer 

The link overhead buffer provides the source for the link overhead section of the serial data 
stream (see Section 11.1). The buffer can be sized at a maximum of 19 slots, all of which are accessible to 
the Tensilica processor. Currently the buffer is sized at TBD slots. Table 11-11 and Table 11-12 
identify the currently defined usage of the link overhead. 

11.5 Output Port Control 

11.6 Link Overhead Information 

As shown in Figure 11-1, all serial channels have an associated link overhead section. This 
information is discussed in Section 4.3.1.5.3. Because the link overhead data is not switched, it can 
only be used as a method to communicate between neighboring devices. The I/O logic makes available 
the synchronization status of each input serial line to be placed into the link overhead. This allows 
neighboring ASICs to know the synchronization status of their respective outgoing serial channels. 

In addition, the sync_offset_count and sync_offset_count_valid values are made available to 
be inserted into the grant channel link overhead. More information about these fields can be found in 
Section 10.5. 



11.7 Programmable Registers 

Each switch port has associated with it the user programmable registers identified in Table 
11-11. The Port/Chip column indicates if there is one register per port (P) or one register per chip (C). 



Table 11-11: Switch Port Programmable Registers 

| Name | Size (bits) | Default | Description |
|---|---|---|---|
| frame_size | 11 | | Size of a row in bytes for the serial data channels |
| frame_pattern | 16 | 0xF628 | The value to be used as the framing pattern |
| lostsync_loop_limit | 4 | 0xa | The number of consecutive framing errors required before a synchronized serial channel is considered out of sync |
| pre_sync_count | 4 | 0x3 | The number of consecutive correct framing patterns required before a serial channel is considered synchronized |
| gate_capture_data_position | 13 | 0x30 | Set to a value that ensures that the start of row from all inputs has been received |
| total_data_slot_limit | 11 | | Set to one less than the total number of data slots in a row |
| valid_data_slot_limit | 11 | | Set to two less than the total number of valid data slots in a row |
| data_row_toggle_position | 11 | | Set to one less than the slot number in which dp_data_row_toggle should toggle |
| total_grant_slot_limit | 11 | | Set to one less than the total number of grant slots in a row |
| valid_grant_slot_limit | 11 | | Set to two less than the total number of valid grant slots in a row |
| grant_row_toggle_position | 11 | | Set to one less than the slot number in which gnt_data_row_toggle should toggle |
| data_framing_pattern_position | 13 | | Set = (position of the first byte of the framing pattern in the outgoing byte stream + start_output_position + 4). |
| grant_framing_pattern_position | 13 | | Set = (position of the first byte of the framing pattern in the outgoing byte stream + start_output_position + 4). |
| data_sor_offset | 13 | | Set = (frame_size - data_framing_pattern_position + 2) |
| grant_sor_offset | 13 | | Set = (frame_size - grant_framing_pattern_position + 2) |
| start_output_position | 13 | 0x1 | Instructs the switch when to start outputting data. The default value of one should be used unless lab testing is being done. |
| start_oslot_position | 13 | | Set = (frame_size - 10) for normal operation. This value must be changed if the core clock and byte clock ratio are changed for testing. |
| data_link_input_enable | 12 | 0x0 | Active high to enable input data links. One bit per link. |
| grant_link_input_enable | 12 | 0x0 | Active high to enable input grant links. One bit per link. |
| data_link_output_enable | 12 | 0x0 | Active high to enable output data links. One bit per link. |
| grant_link_output_enable | 12 | 0x0 | Active high to enable output grant links. One bit per link. |
| disable_scrambler | 1 | 0 | Active high to disable the scrambling function. Should only be used for Verilog simulations. |
| disable_descrambler | 1 | 0 | Active high to disable the descrambling function. Should only be used for Verilog simulations. |
| bypass_sync | 1 | 0 | Active high to bypass the synchronization loop counters. A serial channel will be synchronized as soon as it detects the framing pattern. Should only be used for Verilog simulations. |
| switch_sor_reset | 1 | 0 | Active high to cause the internal synchronization engine to re-synchronize (see Section 10.1). |
| slot_size_select | 2 | 0x0 | Sets the slot size for the serial links. Only used for testing with different serial frequencies. 00 - 36 bits; 01 - 40 bits; 10 - 44 bits; 11 - 48 bits |
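
Several of the registers in Table 11-11 are defined by formulas over other registers. The sketch 
below simply evaluates those formulas; the function name, the argument names, and the example 
numbers are hypothetical, and only the three formulas themselves are taken from the table. 

```python
def derived_port_registers(frame_size: int,
                           framing_byte_position: int,
                           start_output_position: int = 1) -> dict[str, int]:
    """Evaluate the Table 11-11 formulas for one channel type (data or grant)."""
    framing_pattern_position = framing_byte_position + start_output_position + 4
    return {
        "framing_pattern_position": framing_pattern_position,
        "sor_offset": frame_size - framing_pattern_position + 2,
        "start_oslot_position": frame_size - 10,
    }


# Example with made-up numbers: a 1000 byte row, framing pattern at byte 0.
example = derived_port_registers(frame_size=1000, framing_byte_position=0)
```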



11.8 Core Interface Timing 

The interface timing between the core and the input logic is documented in Section 4.3.1.2. 

13 Sbus Modules 
13.1 SPI Interface 

The SPI interface is a 4 wire serial interface developed by Motorola for low speed 
synchronous serial communications. For the iTAP chipset, the SPI interface will be used to connect a 
serial EEPROM to the system for purposes of loading the local microprocessors as well as for storing 
certain state information in non-volatile memory. 

The iTAP SPI module supports the following features: 

• Configurable Speed settings- 500KHz, 1MHz, 2MHz and 4MHz (w/ nominal 250MHz clock) 

• Automatic Page Buffer Management 

• Configurable Page Buffer Size 

• 2 SPI Devices (i.e. 2 chip selects) 

• Interrupt Driven 'Request' Based mode. 

• 16 and 24 Bit Address Mode Compatible 

The SPI bus developed by Motorola allows for 4 different modes of operation. They are 
distinguished by when the clock starts to toggle with respect to a peripheral's chip select and which 
edge of the clock is used to transmit/receive data. In the Motorola SPI these are governed by the CPHA and 
CPOL bits. Out of these 4 modes, vendors (Atmel, Xicor, Microchip) of serial EEPROM devices support 
modes 0 and 3 (0,0 & 1,1). In both of these modes, data is transmitted on the negative edge of the 
clock and received on the positive edge. The difference between the modes is that the clock in mode 0 should 
be low when cs is active before the transfer begins, while mode 3 allows for a negative edge transition of the 
clock. A mode 0 compatible boot loader will be sufficient. If a mode 1 or 2 device needs to be hooked 
up, an external inverter will be needed on the clock line. In addition, vendors of the proms have 
developed faster versions of the chips which aren't bound to the original 2MHz specification. 
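
The mode numbering follows the usual convention of mode = (CPOL, CPHA) read as a two bit number. 
Inverting the clock flips CPOL only, which is why an inverter lets a mode 0 or mode 3 controller talk 
to a mode 2 or mode 1 device. A small sketch of that relationship (function names are illustrative): 

```python
def spi_mode(cpol: int, cpha: int) -> int:
    """Conventional SPI mode number for a clock polarity/phase pair."""
    return (cpol << 1) | cpha


def mode_after_clock_inverter(mode: int) -> int:
    """An inverter on the clock line flips CPOL and leaves CPHA unchanged."""
    return mode ^ 0b10


def directly_supported(mode: int) -> bool:
    """Modes the text says serial EEPROM vendors support."""
    return mode in (0, 3)
```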

13.1.1 Theory of Operations 

The iTAP SPI has 2 modes of operation, normal and interrupt driven. Both are discussed 
below. 

13.1.1.1 Normal Mode 

In 'Normal Mode' of operation the SBUS master can perform reads and writes on the SBUS by 
accessing memory from 0x4000 to 0x400000. This allows for a 4 megabyte external device to be 
supported. In this mode, the SPI looks like a chunk of memory which is just very slow. Behind the 
scenes, the SPI controller takes care of enabling the EEPROM for writes and reading the SPI status 
register. Because of this, the Tensilica can have its reset vector in the SPI space, and boot out of it 
just as it would from any memory. 

When a Read is performed, the SPI controller will check to see if it has an open page, and if so 
close it. Then it will check the Status register to see if it is possible to perform a read. When the status 
register acknowledges that the device is ready for an access, the read is performed, and the data transferred 
to the SBUS master. The time that it takes for this access could be as great as 12 ms, which is the 
time it takes for an SPI program cycle. 

When a Write is performed, the SPI controller will check to see if a 'page' is currently being 
written, and to see if the address matches that of the current page. If it does, the data is written and the 
cycle terminates. Otherwise, the page needs to be closed, program initiated, the status register polled, 
and then the data written to the SPI. This could take as long as 12 ms. To speed up transactions, the bus 
cycle ends as soon as the data is written into the SPI, but before a programming cycle is initiated. 
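
The page handling described for reads and writes can be summarized with a small behavioral model. 
This is a sketch under assumptions: the class, its attributes, the erased-byte value 0xFF, and the 
treatment of a page size of zero (each byte is its own page) are illustrative choices, not taken from 
the controller design. 

```python
class SpiPageModel:
    """Tracks the open page and counts program cycles (up to 12 ms each)."""

    def __init__(self, page_size: int) -> None:
        self.page_size = page_size
        self.open_page = None      # page currently being written, if any
        self.program_cycles = 0    # number of times a page had to be closed
        self.memory = {}

    def _page_of(self, address: int) -> int:
        # A page size of zero models a device without a page register.
        return address // self.page_size if self.page_size else address

    def _close_page(self) -> None:
        if self.open_page is not None:
            self.program_cycles += 1
            self.open_page = None

    def write(self, address: int, value: int) -> None:
        page = self._page_of(address)
        if self.open_page != page:
            self._close_page()     # different page: program the old one first
            self.open_page = page
        self.memory[address] = value

    def read(self, address: int) -> int:
        self._close_page()         # reads always close an open page
        return self.memory.get(address, 0xFF)
```

Writes that stay inside one page cost no program cycle until the page is closed, which is why 
sequential writes are much cheaper than scattered ones. 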

13.1.1.2 Interrupt Driven Mode 

In this mode, the SBUS master will write to an address register, a data register (if a write) and 
a command register. This will initiate an SPI access. When the access is complete an interrupt will be 
signaled. It is also possible to poll a Done bit to see if the transaction has been completed. 

The purpose of this mode is for the host interface to program the SPI and not need to have an 
impossibly long bus time-out value. Otherwise, each individual bus access would have a minimum 
time of several microseconds to 'dittle' the SPI bus and 12 ms to actually program a page into the SPI. 
In an iTAP system, the bus timeout will be much shorter (on the order of several dozen clock ticks 

13.1.1.3 SPI Settings 

• SPI Serial Clock Speed- a 2 bit field to control the speed of the clock as a function of the core 
clock. With a nominal 250MHz core clock, speed combinations from 500KHz to 4MHz are 
possible. 

• SPI Addressing Mode- a single bit to control whether 16 or 24 bit addressing is used for 
the SPI prom. Devices of 64Kbyte and under need to have 16 bits written into them as 
address. Larger devices need to have 24 bits written into their address buffers. The 
reset state of the register is determined by bootmode[2]. 

• SPI Page Register Size- an 8 bit field to determine the number of consecutive addressed bytes 
that can be written into the SPI before the page has to be closed for writing. The reset value of 
this register is zero, since some small devices do not have a page register at all. 

• SPI CS- a single bit to determine which of the SPI chip selects will be active. The reset state of 
this register is determined by the bootmode[0] pin. 



13.1.2 SPI Memory Map and Registers 

| Address | Register | Access | Size (bits) |
|---|---|---|---|
| 0x0000 | SPI Status Register | R/W | |
| 0x0004 | SPI Write Enable Latch | W | 8 |
| 0x0008 | SPI Write DI Latch | W | 8 |
| 0x000C | SPI Address | R/W | 32 |
| 0x0010 | SPI Data | R/W | 32 |
| 0x0014 | SPI Stat | R/W | 32 |
| 0x4000- | SPI Memory | R/W | 32 |



13.1.2.1 SPI Status Register 




Consult SPI data sheet for a more complete description, as the BPI fields are part-specific. 
13.1.2.2 SPI Write Enable Latch 

| Bit | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|
| Field | X | X | X | X | X | X | X | X |

Accesses to this address will generate a Write Enable Command on the SPI. 
13.1.2.3 SPI Reset Write Enable Latch 

| Bit | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|---|---|---|---|---|---|---|---|---|
| Field | X | X | X | X | X | X | X | X |

Accesses to this address will generate a write disable command on the SPI. 
13.1.2.4 SPI Address (Interrupt Mode) 

This is a read/write 32 bit register. The upper 4 bits control whether or not it is a read or write. 
Setting 

Fields, MSB to LSB: BWE3, BWE2, BWE1, BWE0, 0, Address 

To perform a read, program BWE[3:0] = 0; for a write operation, set the bits which should be 
programmed. The following settings are allowed: 









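| BWE3 | BWE2 | BWE1 | BWE0 | Operation |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | Read |
| 1 | 0 | 0 | 0 | Write bits [31:24] |
| 0 | 1 | 0 | 0 | Write bits [23:16] |
| 0 | 0 | 1 | 0 | Write bits [15:8] |
| 0 | 0 | 0 | 1 | Write bits [7:0] |
| 1 | 1 | 0 | 0 | Write bits [31:16] |
| 0 | 0 | 1 | 1 | Write bits [15:0] |
| 1 | 1 | 1 | 1 | Write bits [31:0] |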
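
Software that drives this register may want to reject byte-write-enable patterns that are not in the 
allowed list. A minimal sketch follows; the table contents mirror the settings listed above, while 
the function name and the returned strings are illustrative. 

```python
# Allowed BWE[3:0] patterns mapped to the written bit span (None means read).
ALLOWED_BWE = {
    0b0000: None,
    0b1000: (31, 24),
    0b0100: (23, 16),
    0b0010: (15, 8),
    0b0001: (7, 0),
    0b1100: (31, 16),
    0b0011: (15, 0),
    0b1111: (31, 0),
}


def decode_bwe(bwe: int) -> str:
    """Describe the operation selected by a BWE[3:0] value."""
    if bwe not in ALLOWED_BWE:
        raise ValueError(f"BWE pattern {bwe:04b} is not an allowed setting")
    span = ALLOWED_BWE[bwe]
    if span is None:
        return "read"
    return f"write bits [{span[0]}:{span[1]}]"
```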



13.1.2.5 SPI Data (Interrupt Mode) 

This is a read/write 32 bit register. Data which is to be written to the SPI should be written 
into here. Data which is to be read will be read here. 

13.1.2.6 SPI Stat (Interrupt Mode) 

This register controls the interrupt based mode transfers to the SPI. 




Writes to the GO bit will initiate an SPI bus cycle. During read cycles, if this bit is set, the SPI 
access is in progress. An interrupt will be generated on the high to low transition of this bit. This 
interrupt may be masked via the interrupt controller. The interrupt bit number is TBD. 
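
The register sequence for an interrupt mode access (address, optional data, then GO, then wait for GO 
to clear) can be exercised against a simple software stand-in. Everything below is a sketch: the 
register offsets come from the memory map in Section 13.1.2, but the GO bit position, the address 
field mask, and the FakeSpiModule class are assumptions made only so the example runs. Polling is 
used here in place of the interrupt. 

```python
SPI_ADDRESS, SPI_DATA, SPI_STAT = 0x000C, 0x0010, 0x0014
GO = 1 << 0  # hypothetical bit position; the specification does not give it here


class FakeSpiModule:
    """Software stand-in that completes an access after a few status polls."""

    def __init__(self, busy_polls: int = 3) -> None:
        self.regs = {SPI_ADDRESS: 0, SPI_DATA: 0, SPI_STAT: 0}
        self.eeprom = {}
        self._busy = 0
        self._busy_polls = busy_polls

    def write(self, offset: int, value: int) -> None:
        self.regs[offset] = value
        if offset == SPI_STAT and value & GO:
            self._busy = self._busy_polls   # access starts when GO is written

    def read(self, offset: int) -> int:
        if offset == SPI_STAT and self._busy:
            self._busy -= 1
            if self._busy == 0:
                self._complete()
        return self.regs[offset]

    def _complete(self) -> None:
        bwe = self.regs[SPI_ADDRESS] >> 28
        addr = self.regs[SPI_ADDRESS] & 0x0FFFFFFF  # assumed address mask
        if bwe:
            self.eeprom[addr] = self.regs[SPI_DATA]
        else:
            self.regs[SPI_DATA] = self.eeprom.get(addr, 0xFFFFFFFF)
        self.regs[SPI_STAT] &= ~GO   # GO falling marks completion


def spi_access(dev, address: int, bwe: int = 0, data: int = 0,
               max_polls: int = 100) -> int:
    """Run one interrupt-mode style access and return the data register."""
    dev.write(SPI_ADDRESS, (bwe << 28) | address)
    if bwe:
        dev.write(SPI_DATA, data)
    dev.write(SPI_STAT, GO)
    for _ in range(max_polls):
        if not dev.read(SPI_STAT) & GO:
            return dev.read(SPI_DATA)
    raise TimeoutError("SPI access did not complete")
```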

13.1.2.7 MReset Register Applicable Bits 

The control bits for the SPI are located in one of the bytes of the MRESET register 
(Master Reset Register), which is located at 0x45038. 



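Bit boundaries: 31 28 | 27 24 | 23 20 | 19 16 | 15 12 | 11 8 | 7 0 

Fields, MSB to LSB: Module Resets, SPI Page Size, SPI_A, SPI_CS, SPI Mode, reserved, BootMode 
(with XTCLK and WRST) 

Reset value: Module Resets = 0000 0000; BootMode[0] followed by 0 0 in the SPI control bits; 
low byte = 0000 0001 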

SPI_A: Set high for 24 bit addressing, low for 16 bit addressing. Consult individual SPI EEPROM 
data sheet for the correct mode. 

SPI_CS: Set high for CS0, low for CS1. 

SPI_MODE: Controls the SPI toggle clock speed (assuming 250MHz nominal core clock). 

00 - 500KHz 
01 - 1MHz 
10 - 2MHz 
11 - 4MHz 



13.1.2.8 Interface Notes 
• RESERVED 

13.1.3 Alternative Synchronous Serial Bus Protocols 

Several synchronous serial buses were considered; here are the pros and cons of each. 

SPI 

Originally developed by Motorola, SPI requires four pins on the microcontroller for 
communication between memory and CPU. This is more wires than either I2C or Microwire. However, 
SPI is the fastest protocol (up to 3 MHz). Also, because the processor handles all the 
communication, you don't have to lose precious application memory to serial communications 
algorithms. 

I2C 

Developed by Philips, I2C requires only two wires between CPU and memory device, 
consuming fewer traces than SPI or Microwire. Also, because the data transfer is latched, rather than 
edge-sensitive, I2C has high noise immunity. (It has been popular in automotive applications.) 
Its real disadvantage is speed. Though some I2C implementations can run at 1 MHz, its spec 
tops out at 400 kHz. 

Microwire 

Developed by National Semiconductor, Microwire treads a middle ground between SPI and 
I2C, both in transfer speed (2MHz) and required lines between processor and memory (three). 
The format in which the processor sends an address to the memory device is both a strength 
and a weakness. Specifically, the Microwire protocol requires that the processor send only the 
address bits needed (rather than sending a fixed 16 bits). That's good, because time isn't 
wasted in transferring unneeded bits. It's also bad, because some programmer has to wrestle 
with bit-twiddling address-generation code. 

13.2 UART 

There will exist an RS-232 compatible UART interface. An external level shifter will be used to 
interface it to a standard PC serial port. External IP which implements this function should be obtained. 
Some possible vendors of a UART: 

http://www.synopsys.com/products/designware/dw_fl_ds.html (DesignWare) 

The IP will interface to the Peripheral bus by a simple bus translator. If it does not support a 
loopback mode, one will be added. 

13.2.1 Synopsys DesignWare UART 

The Synopsys DesignWare UART has the capability to function as a 16550, which has small 
receive and transmit FIFOs to minimize processor overhead, as well as a lower performance mode where 
it emulates a 16450, which does not have these FIFOs. This design assumes that the 16550 mode will be 
used. The 2 FIFOs each need to be byte wide, single port (1R, 1W) RAMs which are 16 entries deep. 
Internally it has 12 registers to control it. Please consult the DesignWare documentation for a full 
data sheet on this part. 

13.2.2 Daisy Chainable Mode 

Since many switch chips may be populated on a single circuit board, it would be difficult to 
put a separate connector on the board for each switch chip. External circuitry could be designed to 
allow for this, or this functionality can be designed into the switch device. The latter has been 
chosen. 

Each iTAP Switch will support a daisy chainable RS-232 compatible serial port. There will be 
a command language that can be used to talk to the chips (which is not described in this document). 
This command language will issue commands along with Switch ID numbers such that each switch 
chip is individually addressable. There will also exist a 'shunt mode' so that this chain can be broken 
for diagnostic or debug reasons. 

The loopback and daisy chain shunt will each have a bit to control whether or not the UART is 
in a particular mode. 

Figure 13-1: Daisy Chained Serial Interface 

13.3 iTAP Switch Configuration Registers 

In addition to these peripherals there will be many configuration registers which are also 
accessible through the SBus. These are described below. 



Base Address 0x00440000 

Each entry below is a register spanning bits 31:0. The registers are listed in address order, 
starting at Address Offset 0x0000: 

iLink Control 00 
iLink Control 01 
iLink Control 02 
iLink Control 03 
iLink Control 04 
iLink Control 05 
iLink Control 06 
iLink Control 07 
iLink Control 08 
iLink Control 09 
iLink Control 10 
iLink Control 11 
Switch ID 
Switch Part Number 
MReset 
Bus Timer 
Host Mailbox Flags 
Core Mailbox Flags 
Watchdog Control 
Watchdog Timeout 
Watchdog Service 
Interrupt Type 
Interrupt Mask 
Interrupt #0 Status 
Interrupt #1 Status 
Interrupt #2 Status 
Interrupt #3 Status 
Interrupt Level Register 
SSRAM Configuration 
Uart 

13.3.1 iLink Control Registers 



Bit boundaries: 31 28 | 27 24 | 23 20 | 19 16 | 15 12 | 11 8 | 7 0 

Fields, MSB to LSB: reserved (bits 31:28), iLink Mask (bits 27:16), Spares, reserved, Stage Number 

Reset value: 0000 1111 1111 1111 0000 0000 0000 0000 

iLink Mask: Setting a bit enables an allowable switching configuration from this input link to 
the 12 output links. Bit 27 refers to Link 11, Bit 16 refers to Link 0. The reset value is all 1's 
so that any input is allowed to go to every output without generating an error condition. 
Stage Number: This 3 bit field sets the stage number for the iTAP Switch Element Link. The 
reset value is 0. 
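
Checking whether a given output link is permitted for an input link is a single bit test on the 
iLink Mask field. A minimal sketch, using only the bit positions stated above (bit 16 for Link 0 
through bit 27 for Link 11); the function and constant names are illustrative. 

```python
ILINK_CONTROL_RESET = 0x0FFF0000  # iLink Mask all 1's, Stage Number 0


def output_allowed(ilink_control: int, output_link: int) -> bool:
    """Return True if the iLink Mask permits switching to output_link."""
    if not 0 <= output_link <= 11:
        raise ValueError("output_link must be in the range 0..11")
    return bool((ilink_control >> (16 + output_link)) & 1)
```

For example, clearing bit 27 of the reset value disallows Link 11 while leaving the other eleven 
outputs enabled. 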

13.3.2 iTAP Switch ID Register 

Bit boundaries: 31 28 | 27 24 | 23 20 | 19 16 | 15 12 | 11 8 | 7 0 

Fields, MSB to LSB: Switch ID, reserved 

Reset value: all 32 bits are 0 

This register will hold the individual switch identification number. This number shall be 
unique to the switch matrix, and programmable by the internal master processor or through the host 
interface. This ID number is transmitted in one of the Link Overhead fields. 



13.3.3 iTAP Switch Part Number 



Bit boundaries: 31 28 | 27 24 | 23 20 | 19 16 | 15 12 | 11 8 | 7 0 

Fields, MSB to LSB: Switch Part Number, Switch Revision 

Reset value: Switch Part Number = TBD; Switch Revision = 0000 0001 

This is a 32 bit read only register which contains the switch revision and part number. This 
will allow multiple generations of devices to share a common code base. 

This value is TBD. 
13.3.4 iTAP Switch Reset Register 

Bit boundaries: 31 28 | 27 24 | 23 20 | 19 16 | 15 12 | 11 8 | 7 0 

Fields, MSB to LSB: Module Resets (bits 31:24), SPI Page Size, SPI_A, SPI_CS, SPI Mode, 
reserved, BootMode (with XTCLK and WRST) 

Reset value: Module Resets = 0000 0000; low byte = 0000 0001 


Module Resets: Write a logical 1 to any of these bits to bring that module out of reset. 

Bit 31: Master Processor 

Bit 30: DataPath 

Bit 29: Arbitration 

Bit 28: Multicast 

Bit 27: Serial / Deserializer 

Bit 26: unassigned 

Bit 25: unassigned 

Bit 24: unassigned 

SPI Page Size, SPI_A, SPI_CS, SPI_MODE: All of these bits are described in the SPI section of 
this document. 

13.3.5 iTAP Bus Control 

Bit boundaries: 31 28 | 27 24 | 23 20 | 19 16 | 15 12 | 11 8 | 7 0 

Fields, MSB to LSB: Full Decode (bits 31:24), Bus Timer (bits 23:0) 

Reset value: Full Decode = 0000 0000; Bus Timer = 0x40020 

Full Decode: Setting one of these bits forces the target module to fully decode the address bus.
Otherwise, the decoded module should never bus error during unmapped accesses.

Bit 31: Multicast Controller

Bit 30: SSRAM, Shared RAM

Bit 29: Arbitration

Bit 28: Serial / Deserializer

Bit 27: Data Path

Bit 26: SPI Address Space

Bit 25: Switch Mailbox RAM

Bit 24: Switch RISC Core Registers: DMA, IRQ, Switch Configuration Regs, etc.

Bus Timer: A bus timer in the BIF forces a termination after a presettable amount of time.
All 24 bits are used when the SPI is being addressed; only the lower 8 bits are used when any
other module is being addressed. The timer counts core-clock ticks. For a nominal 250MHz core
clock frequency, use 4 ns per tick. The reset value is 0x40020.
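
As a worked example of the Bus Timer arithmetic, the sketch below converts a timer field value
into a time-out in nanoseconds. The function name and the integer rounding are assumptions made
for illustration; only the 24 bit / 8 bit field widths, the 250MHz nominal clock and the 0x40020
reset value come from the text.

```python
CORE_CLOCK_HZ = 250_000_000  # nominal core clock quoted in the text (4 ns per tick)


def bus_timeout_ns(timer_value: int, spi_access: bool,
                   core_clock_hz: int = CORE_CLOCK_HZ) -> int:
    """Return the bus time-out in nanoseconds for a Bus Timer field value.

    All 24 bits count when the SPI is addressed; only the lower 8 bits
    count for any other module.
    """
    mask = 0xFFFFFF if spi_access else 0xFF
    ticks = timer_value & mask
    # Integer arithmetic avoids floating point rounding surprises.
    return ticks * 1_000_000_000 // core_clock_hz


reset_value = 0x40020
print(bus_timeout_ns(reset_value, spi_access=True))   # SPI access, all 24 bits
print(bus_timeout_ns(reset_value, spi_access=False))  # other modules, low 8 bits
```

With the reset value, an SPI access times out after 262176 ticks (about 1.05 ms), while an
access to any other module times out after 32 ticks (128 ns).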

13.3.6 Host to Core Mailbox Interrupts 

Bits:         31 ... 28 | 27 ... 24 | 23 ... 20 | 19 ... 16 | 15 ... 12 | 11 ... 8 | 7 ... 0
Fields:       Host 2 Core Irq #0 | Host 2 Core Irq #1 | reserved
Reset Value (bit cells as listed): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0


When the host writes to the mailbox interrupt bits, an interrupt will be signaled to the
internal Tensilica master processor. The Tensilica shall be able to clear the interrupt by writing a 1
to the offending bit. These bits are wired to 2 interrupts, eight bits for each interrupt. In this
way there will be a method to create a high and a low priority interrupt for mailbox message requests
and acknowledgements.

13.3.7 Core to Host Push Mailbox Interrupts 



Bits:         31 ... 28 | 27 ... 24 | 23 ... 20 | 19 ... 16 | 15 ... 12 | 11 ... 8 | 7 ... 0
Fields:       Core 2 Host Irq #0 | Core 2 Host Irq #1 | reserved
Reset Value (bit cells as listed): 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0


The Tensilica can program any of these bits, which will generate a host interface interrupt.
There are 2 groups of signals, each connected to a separate interrupt. In this way there will be a
method to create a high and a low priority interrupt for mailbox requests and acknowledgements.

The bits are cleared, in general, by the host interface writing to its core 2 host interrupt
register.



13.3.8 SSRAM Configuration

See RISC Chapter on SSRAM

13.3.9 Interrupt Configuration, Status Registers

See RISC Chapter on Interrupt Controller

13.3.10 Watchdog Timer

See RISC Chapter on Watchdog Timer

13.3.11 UART Registers

See UART Chapter 11



The switch chip will have 2 reset pins. One is for hardware reset, the other for software reset.
The hardware reset will be used for power-up reset, power glitches and to re-initialize the entire
system. Every register shall be cleared or set to its default value when the hardware reset is toggled.
The software reset will actually cause an interrupt to the microprocessor, which can then decide
which subsystems should be reset. Upon a software reset, the local processor can do some
housekeeping and save certain state information, as well as decide to warm boot or cold boot.

The Tensilica can write to and reset any one of the bits in the reset register. The bits in this
register need to be toggle bits: writing a zero to them has no effect, and writing a 1 to them will
either set or clear the bit, depending on its state.

13.3.12 Tensilica Reset Register 

There will exist a register which is only cleared upon a cold reset (hreset). This register will
allow the Tensilica to know why it reset itself or was reset in the event of a watchdog timer time-out.





Field:   Software Defined | Host Processor Reset | Tensilica Internal Reset (other than NMI) | Tensilica NMI caused Reset | Watchdog Timer Warm Reset | reserved
reset:   Default Values TBD | 0
access:  r/w r/w r/w r/w r/w r/w r/w r/w r/w | r

Table 13-2: Microprocessor Software Reset Register

14.1 Overview

The host interface is a parallel external interface to the iTAP Switch chip. This interface will
be used to do the following:

• Exchange messages with the iTAP Switch Tensilica Processor

• Download boot code for the Tensilica Processor

• Read and Write Configuration Registers

• Fully operate the switch in applications where the iTAP Switch Tensilica Processor is not used.

The host interface is designed to be a glueless interface with a Motorola 68360 operating in
asynchronous mode with a 16 bit port size. It is, however, an asynchronous interface, so a device of
any speed could be used as long as the signals are compatible. Since there are literally hundreds of
RAMs in the iTAP Switch, this interface will support a page mode to access these registers. The host
interface will support a 16Kbyte memory space (13 address lines). This address space will be divided
into 2 regions: a paged region, which allows access to the internal switch registers, and a 'fixed'
address region, which contains host interface configuration registers and 4 mailboxes to exchange
data with the iTAP Switch Tensilica.

• Page Size is 15.5K bytes.

• Fixed Contents is 512 bytes.

Address    Memory Space
0x0000     Paged Memory
0x3E00     Fixed Contents

Address    Memory Space
0x3E00     Page Register
           Other 'Fixed' Registers
0x3F00     Hi Priority Mailbox (IN)
0x3F40     Low Priority Mailbox (IN)
0x3F80     Hi Priority Mailbox (OUT)
0x3FC0     Low Priority Mailbox (OUT)


Table 14-1: Host Interface Address Map 
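
The split of the 16Kbyte host address space can be checked with a few lines of arithmetic. The
sketch below is illustrative only; the constant and function names are assumptions, while the
0x3E00 boundary, the 15.5K byte page size and the 512 byte fixed region come from the text above.

```python
SPACE_SIZE = 0x4000   # 16 Kbyte host address space
FIXED_BASE = 0x3E00   # start of the 'fixed' region; everything below is paged


def host_region(addr: int) -> str:
    """Classify a host byte address as belonging to the paged or fixed region."""
    if not 0 <= addr < SPACE_SIZE:
        raise ValueError(f"address 0x{addr:X} is outside the 16 Kbyte host space")
    return "fixed" if addr >= FIXED_BASE else "paged"


print(FIXED_BASE)               # size of the paged window in bytes
print(SPACE_SIZE - FIXED_BASE)  # size of the fixed region in bytes
print(host_region(0x3DFF), host_region(0x3E00))
```

The paged window is 0x3E00 = 15872 bytes (15.5 x 1024), which leaves exactly 512 bytes for the
fixed contents.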

For normal use, it is expected that the host will communicate to the Switch via several 
mailboxes. These mailboxes will allow for the passing of low and high priority control messages to the 
local processor inside of the Switch. There is also a mode where the processor may not be used which 
gives the host full access to all of the iTAP Switch's registers. In fact, the host could run the switch 
and act as a surrogate local processor. 

This document outlines the register map of the host interface, then follows with a description
of the interfaces which it will need, followed by some implementation notes.

14.2 Programmable Page Memory Map 

The following table is a summary of the valid page numbers, and what they page in. For the

Page Number    Description
000            Undefined (mirror of page 1)
001            SPI
255
256            Data Path Link 0 CSR
257            Data Path Link 1 CSR
258            Data Path Link 2 CSR
259            Data Path Link 3 CSR
260            Data Path Link 4 CSR
261            Data Path Link 5 CSR
262            Data Path Link 6 CSR
263            Data Path Link 7 CSR
264            Data Path Link 8 CSR
265            Data Path Link 9 CSR
266            Data Path Link 10 CSR
267            Data Path Link 11 CSR
268            Data Path Global CSR
269            not currently assigned
270            not currently assigned
271            not currently assigned
272            Grant Mapper
273            Grant Demapper
274            Arbitration Stats & Regs
275            DMA
276            Misc Registers
277            Mail Box Ram
278            UART
512            External Ram Start
767            External Ram End
768            Shared RAM Start


Table 14-2: Page Number Decoding 

14.2.1 Paged Memory Accesses 

The paged memory space will perform writes just as expected, but reads are more complex
due to the nature of the data being read. Many of the iTAP switch registers will be counters, which
may need to be bigger than 16 bits. To perform 32 bit reads without double buffering every counter,
the following will occur:

• Reads to the Paged Memory Address space will load 32 bits into a set of registers.

• The address +2 will also be stored.

• The 'correct' 16 bits will be output on the host interface data bus.

• If the next host access matches the stored address, then the other data word will be read out
onto the host interface data bus. If the next host access does not have an address match, another
full read is performed.
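
The read sequence above can be modelled in software. The following is a minimal behavioural
sketch, not the hardware design: the class and method names are hypothetical, the big endian
half-word order is assumed from the 68360 port convention, and arming the tag only on
upper-half reads is a simplifying assumption.

```python
class PagedReadBuffer:
    """Behavioural model of the 32 bit read latch with an address tag."""

    def __init__(self, read32):
        self._read32 = read32   # callable: 4-byte aligned address -> 32 bit value
        self._latched = 0       # last 32 bit word fetched from the switch
        self._tag = None        # stored "address + 2", or None when not armed
        self.fetches = 0        # number of full 32 bit reads performed

    def read16(self, addr: int) -> int:
        if addr & 1:
            raise ValueError("16 bit reads must be word aligned")
        if addr == self._tag:
            # Address match: return the other half without touching the register.
            self._tag = None
            return self._latched & 0xFFFF
        # No match: perform a full 32 bit read and latch it.
        self._latched = self._read32(addr & ~0x3) & 0xFFFFFFFF
        self.fetches += 1
        upper_half = (addr & 2) == 0
        # Assumption: only an upper-half read arms the tag for the lower half.
        self._tag = addr + 2 if upper_half else None
        return (self._latched >> 16) & 0xFFFF if upper_half else self._latched & 0xFFFF


registers = {0x100: 0x12345678, 0x104: 0xCAFEF00D}
buf = PagedReadBuffer(lambda a: registers[a])
print(hex(buf.read16(0x100)), hex(buf.read16(0x102)), buf.fetches)
```

Reading 0x100 and then 0x102 costs a single access to the underlying counter, so both halves
come from the same snapshot of the 32 bit value.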

14.3 Fixed Page Memory Map 

The following memory map describes the 'Fixed Contents' part of the host interface address
map.

Address          Contents
0x3E00           Page Register
0x3E04           Interrupt Mask Register
0x3E08           Interrupt 0 Control Register
0x3E0C           Interrupt 1 Control Register
0x3E10           reserved
0x3E14           reserved
0x3E18           Programmable Interrupt Level Reg 0
0x3E1C           Programmable Interrupt Level Reg 1
0x3E20           Programmable Interrupt Level Reg 2
0x3E24           Miscellaneous Register
0x3E28           Message Status Registers
0x3E28-0x3EFC    reserved
0x3F00           Low Priority Mailbox (In)
0x3F40           Hi Priority Mailbox (In)
0x3F80           Low Priority Mailbox (Out)
0x3FC0           Hi Priority Mailbox (Out)


Table 14-3: Fixed Contents Address Map 

14.3.1 Page Register 

This is a 32 bit read/write register. Writes take effect immediately and change what is mapped
into the lower host interface address space.

14.3.2 Interrupt Registers 

The host interface will need access to all of the same interrupt sources as the Tensilica. It 
must have the ability to receive the same interrupts as the Tensilica as well as have the ability to 
program different interrupts at different priorities. To facilitate this, the same Interrupt Controller 
that the Tensilica will be using will be instantiated here. This will give the host the greatest flexibility. 

The following registers are local to the Host Interface. The interrupt mapping will be the same 
as that given in Chapter 9, Interrupt Controller. Although each interrupt source is given 4 bits of 
priority encoding, only the lsb of each of the nibble fields will be used. 



Address    Register
0x3E04     Interrupt Mask Register
0x3E08     Interrupt 0 Register
0x3E0C     Interrupt 1 Register
0x3E10     reserved
0x3E14     reserved
0x3E18     PI00 | PI01 | PI02 | PI03 | PI04 | PI05 | PI06 | PI07
0x3E1C     PI08 | PI09 | PI10 | PI11 | PI12 | PI13 | PI14 | PI15
0x3E20     PI16 | PI17 | PI18 | PI19 | PI20 | PI21 | PI22 | PI23




There will be 2 added interrupts (#22-23): 

• Hi Priority Mailbox Interrupt 

• Low Priority Mailbox Interrupt 



14.3.3 Misc Register 

The miscellaneous register controls the bus error modes for the switch.

Address 0x3E24: Misc Register


Proprietary and Confidential Information of Onex Communications Corporation



Bits (in the order listed): 0 | 0 | PMWL | PMRL | 0 | 0 | CRST | PRST



All of these bits are 'toggle' bits, meaning writes of 0 have no effect on them. To clear or set
any of these bits, write a logical 1 to them.

PMWL - Page Memory Write Lockout

If this bit is set, any writes to the paged memory will not return a dsack. This is to prevent the
host from setting bits while in message-passing mode. In fabrics that do not use the iTAP
Switch local microprocessor, this bit would be cleared. This bit powers up to a zero.

PMRL - Page Memory Read Lockout

If this bit is set, any reads to the paged memory will not return a dsack. This is to prevent the
host from getting any status information while in message-passing mode. During debug, even
in message-passing mode, this bit will generally be cleared to ensure that random diagnostic
reads do not bus error the system. However, application software may want to set this bit to
debug a system where a bad pointer may be reading an errant location.

The reset register gives the host the ability to reset the iTAP switch chip, or the Switch chip
microprocessor sub-system.

CRST - Chip Reset

Writing a 1 to this bit will reset the entire chip, including the host interface, when the
access is complete.

PRST - Processor Reset

Writing a 1 to this bit will place the Switch microprocessor into reset. To take the processor
out of reset the host interface will need to toggle this bit to zero by writing a 1 to it. The
power up status of this bit will be the value on the BootMode[1] pin (which is used to
enable or disable initial booting of the internal Master Processor).
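
The toggle behaviour of the Misc Register bits amounts to an exclusive-OR of the written value
into the register. The sketch below illustrates this; the bit positions are an assumption read
off the listed field order (PMWL, PMRL, CRST, PRST in the low byte) and the function name is
hypothetical.

```python
# Assumed bit positions within the low byte of the Misc Register.
PRST = 1 << 0
CRST = 1 << 1
PMRL = 1 << 4
PMWL = 1 << 5
TOGGLE_MASK = PRST | CRST | PMRL | PMWL


def write_toggle(current: int, written: int) -> int:
    """Return the new register value after the host writes `written`.

    Writing 0 to a bit leaves it alone; writing 1 flips it.
    """
    return current ^ (written & TOGGLE_MASK)


reg = 0
reg = write_toggle(reg, PMWL)   # set the write lockout
reg = write_toggle(reg, 0)      # writing zeros changes nothing
print(bin(reg))
reg = write_toggle(reg, PMWL)   # a second write of 1 clears it again
print(bin(reg))
```

Because a write of 1 flips the bit rather than forcing a value, software that wants a known
state would read the register first and toggle only the bits that differ.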

14.3.3.1 Host Interface Bootstrap Mechanism 

When the iTAP Switch Processor is in reset (PRST is set), code may be safely downloaded to the
boot RAM. This is done by setting the page register to 'BootCode' and writing the boot image in. The
image may be read back at any time to ensure that it was delivered successfully. When the host has
correctly put the boot image into the BootCode space, it will then clear the PRST bit. At this time, the
local switch processor will begin executing code from its reset vector.
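
The bootstrap steps can be sketched against a small software stand-in for the host interface.
Everything below is illustrative: `FakeHost`, `download_boot_image`, the `BOOT_PAGE` number and
the PRST bit position are hypothetical, and the 32 bit page register is simplified to a single
write. Only the fixed addresses 0x3E00 and 0x3E24 and the order of the steps come from the text.

```python
PAGE_REG = 0x3E00      # Page Register (fixed contents)
MISC_REG = 0x3E24      # Miscellaneous Register (fixed contents)
WINDOW_BYTES = 0x3E00  # size of the paged window
PRST = 0x0001          # assumed bit position of the PRST toggle bit
BOOT_PAGE = 0x300      # hypothetical page number for the 'BootCode' space


class FakeHost:
    """Tiny stand-in for the 16 bit host interface, for illustration only."""

    def __init__(self):
        self.page = 0
        self.misc = PRST          # processor starts out held in reset
        self.mem = {}

    def write16(self, addr, value):
        if addr == PAGE_REG:
            self.page = value
        elif addr == MISC_REG:
            self.misc ^= value    # toggle-bit semantics
        else:
            self.mem[(self.page, addr)] = value & 0xFFFF

    def read16(self, addr):
        if addr == PAGE_REG:
            return self.page
        if addr == MISC_REG:
            return self.misc
        return self.mem.get((self.page, addr), 0)


def download_boot_image(host, words, boot_page=BOOT_PAGE):
    """Write a boot image, verify it, then release the processor from reset."""
    if 2 * len(words) > WINDOW_BYTES:
        raise ValueError("image does not fit in one paged window")
    if not host.read16(MISC_REG) & PRST:
        host.write16(MISC_REG, PRST)            # put the processor into reset
    host.write16(PAGE_REG, boot_page)           # map the boot code page
    for i, word in enumerate(words):
        host.write16(2 * i, word)
    verified = all(host.read16(2 * i) == w for i, w in enumerate(words))
    if verified:
        host.write16(MISC_REG, PRST)            # toggle PRST back to zero
    return verified


host = FakeHost()
print(download_boot_image(host, [0x1234, 0xABCD]), host.misc & PRST)
```

The read-back step matters because PRST is only toggled off once the image is known to be
intact; a failed verify leaves the processor safely in reset.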

14.3.4 Message Status Registers 

The control structures for these mailboxes are located below in the fixed contents area:

Address 0x3E28 (fields in the order listed):

High Priority Message In:    Host to Master | Master to Host
High Priority Message Out:   Host to Master | Master to Host
Low Priority Message In:     Host to Master | Master to Host
Low Priority Message Out:    Host to Master | Master to Host


This register is divided into 4 sections, one for each mailbox: High Priority Message In,
High Priority Message Out, Low Priority Message In and Low Priority Message Out. Each one of the
mailbox status registers has 4 bits which the host can set to cause an interrupt on the master
processor. There are also 4 bits which the master processor can set to cause an interrupt on the
host processor. The 2 High Priority Mailboxes are on interrupt #X and the 2 Low Priority Mailboxes
are on interrupt #Y. It is expected that the host and master processor will program the high priority
mailboxes on a different (and higher) interrupt than the low priority mailboxes to facilitate 2 classes
of messages. To clear any of these bits, the host processor must write a logical 1 to them. This
ensures that if the master processor attempts to set another bit during the process of clearing a
bit, it is not overwritten. The hardware makes no restrictions or assumptions other than connecting
one interrupt to the High Priority Response bits, and another to the Low Priority Response bits. The
bits are aligned such that the 16 bit host interface can set and clear a mailbox status register in a
single write cycle, i.e. clearing the interrupt condition as well as giving its response if software desires.
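
The reason for clearing by writing a 1 can be seen by comparing it with a read-modify-write
clear. The sketch below is a software illustration only; the function name and the bit values
are hypothetical.

```python
def write_one_to_clear(reg: int, written: int) -> int:
    """Clear only the bits that are written as 1; all other bits are untouched."""
    return reg & ~written


status = 0b0001                 # bit 0 is pending
snapshot = status               # host reads the status register
status |= 0b0100                # master sets bit 2 before the host writes back

# Write-1-to-clear: the host writes the bits it has handled.
status_w1c = write_one_to_clear(status, snapshot)

# Read-modify-write: the host writes back its stale copy with bit 0 cleared.
status_rmw = snapshot & ~0b0001

print(bin(status_w1c), bin(status_rmw))
```

With write-1-to-clear the newly set bit 2 survives, whereas the read-modify-write version
silently discards it.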

Please consult the software documentation concerning the usage of these bits.

14.3.5 Host Interface Mailboxes

For messages in and out of the switch there are high and low priority mailboxes. The format of
these mailboxes is as follows:

Address    Contents
0x3F00     In Mailbox, Low Priority - 64 bytes
0x3F04
0x3F08
...
0x3F3C
0x3F40     In Mailbox, High Priority - 64 bytes
0x3F80     Out Mailbox, Low Priority - 64 bytes
0x3FC0     Out Mailbox, High Priority - 64 bytes




14.4 Interfaces 

The host interface has 3 ports: the 68360 external bus, the mailbox bus and the HIF
interface. Each is described below. In the block diagram the address, data and byte enable control
signals have been omitted for clarity.



[Block diagram. Blocks: 68360 Bus Interface, HIF State Machine, HIF Tag Address, HIF Data (Lower
Word), Mailbox RAM. Signals: ds_b, we_b, cs_b, dsack_b, hdata, dataw, HIFdatar, hif_cs, mx_cs,
hreg_cs, ddir, Host Interface Bus. Note in figure: Register Address bus and other control signals
before using them.]

Figure 14-4: Host Interface Block Diagram

14.4.1 Motorola 68360 Bus Interface

The host bus interface will have the following pins:

Pin Name      Width  I/O  Description
hcs_b         1      I    Chip Select (Active low)
hds_b         1      I    Data Strobe (Active low)
hwe_b[1:0]    2      I    Read=1, Write=0
                          hwe_b[1] -> dbus bits[31:24]
                          hwe_b[0] -> dbus bits[23:16]
haddr         14     I    Address Pins
hdata         16     B    Data Bus
irq0_b        1      O    Interrupt 0
irq1_b        1      O    Interrupt 1
hdsack_b      1      O    Transfer Ack (Active low), open drain



The 68360 needs to be programmed to output individual byte write-enables, as well as accept
external data transfer acknowledges. Since the 68360 has a 32 bit data bus, the bus needs to be
connected as follows:

68360 Pin      iTAP Switch Pin
addr[13:1]     haddr[12:0]
data[31:16]    hdata[15:0]
ds_b           hds_b
we_b[1]        hwe_b[1]
we_b[0]        hwe_b[0]
dsack_b[1]     hdsack_b
dsack_b[0]
irq[x]_b       irq0_b
irq[y]_b       irq1_b


This will always signal a 16 bit port size to the 68360. Refer to 68360 User's Manual Table 4-2
'DSACKx Encoding'. The 68360's dsack0 pin should be pulled up.
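
The wiring above implies a simple translation from a 68360 byte address to the switch's word
address and byte write enable. The sketch below illustrates it; the function name is
hypothetical, and the lane selection assumes the usual big endian convention in which the even
byte travels on data[31:24].

```python
def decode_68360_byte_address(addr: int):
    """Map a 68360 byte address to (haddr, byte write enable).

    addr[13:1] drives haddr[12:0]; addr[0] selects the byte lane.
    """
    haddr = (addr >> 1) & 0x1FFF
    # Even byte -> data[31:24] -> hwe_b[1]; odd byte -> data[23:16] -> hwe_b[0].
    lane = "hwe_b[1]" if addr & 1 == 0 else "hwe_b[0]"
    return haddr, lane


for byte_addr in (0x0000, 0x0001, 0x0002, 0x3E00, 0x3E01):
    print(hex(byte_addr), decode_68360_byte_address(byte_addr))
```

Two consecutive byte addresses share one haddr value and differ only in which write enable is
asserted, which is what signalling a 16 bit port size to the 68360 relies on.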



14.4.1.1 68360 Device Settings 

• The following registers will need to be set in it:

PEPAR -> Bit 7, set to 1. Set this to '1' so that WE_b[3:0] are driven. Out of reset, the 68360
tri-states these pins, so they will need pull-up resistors.

SPS[1:0] -> In the Option Register, need to be set to '11' for external dsack generation.

14.4.1.2 Timing Diagrams 



[Write timing waveform; only the 0ns time-axis origin is recoverable.]

Figure 14-5: Host Interface Write Timing Diagram

[Read timing waveform. Time axis: 0ns, 25ns, 50ns. Signals: HCSx, HDSx, Haddr, HWEx[1:0],
HDSACKx, HDATA.]

Figure 14-6: Host Interface Read Timing Diagram

Note: Since this is an asynchronous interface, the absolute timing is unimportant.



14.4.1.3 Bus Calculations 

The time the 68360 spends waiting on the host interface can be calculated as follows:

Worst Case Interface:

• Local Registers: 0 ws

• Mailbox RAM: 0 ws

• SBus Interface: 2 interfaces w/ longword xfers ahead of host interface

• FBus Interface: 1 lw, 1 line xfer ahead of the host interface

Accesses to the FBus need to take into account the following delays:

• Bring DSx into iTAP Switch Clock Domain (2 clks)

• Sample, assert HIF Request (2 clks)

• Arbitrate for FBus.

• DMA Line Access to external SSRAM = 13 clks

• Processor LW Access to external SSRAM = 8 clks

• Host IF LW Access to external SSRAM = 8 clks

• Handshaking to Dsack assertion = 3 clks

Total latency is 36 internal 250MHz clock cycles. This will incur ~150ns of latency on the
bus, which at 50MHz is 8 clock cycles and at 33MHz is 5 clock cycles.

The 2 typical cases are when the Tensilica is operating and the host interface is using the
mailboxes (which has 0 ws) and when the Tensilica is not operating and the host interface is
controlling the switch via the SBus. SBus latency will be 2 long word accesses. Assuming the SBus
runs at 1 ws, the latency from DSx to DSackx will be 15 core clock cycles. At 250MHz this is 60ns,
which translates to 2/3 (33/50 MHz) host interface wait states to write a 16 bit word. For reads to
the 32 bit internal registers, 1 read cycle with the wait states will pre-load the second part of the
word, so that it may be accessed with 0 wait states.
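
The figures above can be reproduced with integer arithmetic. The sketch below is only a check
of the numbers in the text; the dictionary keys and function names are made up for
illustration. Note that 36 core clocks at 4 ns each is 144 ns, which the text rounds to about
150 ns.

```python
CORE_MHZ = 250

# Worst case FBus delays, in core clocks, as itemised in the text.
FBUS_WORST_CASE_CLKS = {
    "sync DSx into core clock domain": 2,
    "sample and assert HIF request": 2,
    "DMA line access ahead of host": 13,
    "processor longword access ahead of host": 8,
    "host interface longword access": 8,
    "handshake to dsack assertion": 3,
}


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def host_wait_cycles(core_clks: int, host_mhz: int, core_mhz: int = CORE_MHZ) -> int:
    """Host bus clock cycles needed to cover a latency given in core clocks."""
    return ceil_div(core_clks * host_mhz, core_mhz)


total_clks = sum(FBUS_WORST_CASE_CLKS.values())
print(total_clks, total_clks * 1000 // CORE_MHZ)                      # clocks, ns
print(host_wait_cycles(total_clks, 50), host_wait_cycles(total_clks, 33))
print(host_wait_cycles(15, 33), host_wait_cycles(15, 50))             # SBus case
```

The worst case FBus access costs 8 host clocks at 50MHz and 5 at 33MHz, and the typical SBus
case of 15 core clocks (60 ns) costs 2 and 3 host clocks at 33MHz and 50MHz respectively.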

14.4.2 HIF Bus Interface 

The HIF bus will have the capability of performing 32 bit read/write transfers to both the
SBus and the FBus. It will not support any of the FBus bursting modes, because there is little
buffering in the Host Interface and the 68360 bus interface can't keep up. The HIF will allow the
external host to control the iTAP Switch chip just as the embedded Tensilica processor can. All of the
same memory mapped registers and interrupts are available to it. This state machine will take the
transactions from the 68360 bus and perform a bus conversion to a simplified Tensilica PIF bus. This
bus will not support block reads since more buffering would be needed, and the expected interface
speed (PCI translated to 68360 async i/f) does not have adequate bandwidth to support block
reads at 250MHz.

14.4.3 Mailbox Ram Bus Interface 

The Mailbox RAM interface gives the host fast access to a small dual port RAM through which it
can communicate with the local Tensilica processor via a messaging scheme which is TBD. This is a
single cycle RAM, so there are no latency or acknowledge signals which need to be passed back to the
controller. It will merely pass along the address, data and byte enable control signals. Since the
68360 does not support unaligned transfers, the host interface's 16 bit data bus will be replicated
as it goes into the Mailbox RAM.

This interface tightly couples onto an SRAM, and there will be byte writes. The RAM will be a
true dual port RAM that is 64 entries x 32 bits wide.

14.5 Host Interface Top Level I/O 









Signal        Width  I/O  Description

Host Interface Pins
hcs_b         1      I    Chip Select (Active low)
hds_b         1      I    Data Strobe (Active low)
hwe_b[1:0]    2      I    Read=1, Write=0
haddr[12:0]   13     I    Address Pins
hdata         16     B    Data Bus
irq0_b        1      O    Interrupt 0
irq1_b        1      O    Interrupt 1
hdsack_b      1      O    Transfer Ack (Active low), open drain

Mailbox RAM Interface
maddr         6      O    Mailbox address, long word aligned
mdataw        32     O    Mailbox Write data bus
mdatar        32     I    Mailbox Read data bus
mwe           4      O    Mailbox Write Enable
mcs           1      O    Mailbox CS

HIF Interface
HIFCnt        1      O    HIF Control Bits
HIFValid      1      O    HIFValid
BIFCntl       2      I    BIF Module Control Bits
BIFValid      1      I    BIFValid
BIFReqRdy     1      I    BIF Module Ready
HIFAddr       X      O    HIF Address
HIFdataW      32     O    HIF Write Data bus
HIFdataR      32     I    HIF Read Data bus
HIFBe         4      O    HIF Byte Enables

Misc
sysclock      1           System Clock
reset_b       1      I    Reset (Active Low)
pirq          8      I    Processor Interrupts
sleep         1      I    Processor Sleep

14.6 Notes 

1. 68360 can abort the bus cycle, make sure we can handle this correctly. 

2. Double buffer the control inputs, they're asynchronous. 

3. The 68360 16 bit port size will cause the following decoding:

[Figure: 68360 address bus [31:00] byte addresses 0x00 through 0x07 mapped onto the iTAP Switch
address bus (0x00 through 0x03) together with the we1 and we0 byte write enables.]


Figure 14-7: Big Endian 16 bit 68360 Port Addressing 
The size of the host interface will be:

• 64x32 bit Dual Port RAM

• 32 bit read buffers

• 13 bit address tag buffer

• Local Registers: 10 page bits, Irq ~80, 10 misc, 25 for FSMs and output registering

• Metastability: 34 FFs (need to sample data bus at the correct time, so we don't need to register
it).

Total Flip Flops: 210 + Dual Port RAM of 2048 bits.

15 Configuration/Status Register List

This section summarizes the Control Status Registers (CSRs) used to configure and monitor
the operation of the iTSE. CSRs include all configurable memory devices within the iTSE. These
devices may be individual flip-flops, register arrays or memory arrays.

Table 15-1: Datapath CSRs 

Address Offset    31 24 | 23 16 | 15 8 | 7 0
0x0000
0x0004



15.1 Memory Map: 

The memory map of the chip is 32 megabytes. All registers are accessible via the host
interface by setting the page register accordingly.
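
Comparing the Address Range and Host Interface Page Range columns of the memory map in this
section suggests that each page number selects a 16 Kbyte aligned block, i.e. the internal
address is the page number shifted left by 14 bits. The sketch below encodes that inference; it
is an assumption drawn from the listed values, not a stated rule, and the function names are
hypothetical.

```python
PAGE_SHIFT = 14  # assumed: one page number per 16 Kbyte of internal address space


def page_base(page: int) -> int:
    """Internal base address selected by a host interface page number."""
    return page << PAGE_SHIFT


def page_for_address(address: int):
    """Split an internal address into (page number, offset within the page)."""
    return address >> PAGE_SHIFT, address & ((1 << PAGE_SHIFT) - 1)


for page in (0x100, 0x10D, 0x113, 0x200, 0x300):
    print(hex(page), hex(page_base(page)))
```

The values printed match the listed base addresses for the DataPath CSRs, Synchronizers, DMA,
External RAM and Internal Shared RAM. Offsets at or above 0x3E00 inside a 16 Kbyte block fall
outside the 15.5K byte paged window; how those locations are reached is not stated here.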



Bus Mapping  Block Description     Address Range              Host Interface Page Range  Size (MB)
SBus         SPI Memory Space      0x0000 0000 - 0x003F FFFF  0x000 - 0x0FF              4
SBus         DataPath CSRs         0x0040 0000 - 0x0043 0000  0x100 - 0x10C
SBus         Synchronizers         0x0043 4000                0x10D
SBus         Arbitration           0x0044 0000 - 0x0044 8000  0x110 - 0x112              4
SBus         DMA                   0x0044 C000                0x113
SBus         Misc Registers        0x0045 0000                0x114
SBus         MailBox Ram           0x0045 4000                0x115
SBus         UART                  0x0045 8000                0x116
FBus         External RAM          0x0080 0000 - 0x007F FFFF  0x200 - 0x2FF              4
FBus         Internal Shared Ram   0x00C0 0000 - 0x00004000   0x300 - 0x301              0.032

15.1.1 Synchronizer Memory Map



Base Address 0x00434000 



31 



24 23 



16 15 



8 7 



Address 
0 Offset 



frame_size 



0x0000 
0x0004 

0x0008 
OxOOOC 
0x0010 
0x0014 
0x0018 
OxOOlC 
0x0020 
0x0024 
0x0028 
0X002C 
0x0030 
0x0034 
0x0038 
0x0O3C 
0x0040 
0x0044 
0x0048 
0x004C 
0x0050 
0x0054 
0x0058 
OxOOSC 
0x0060 
0x0064 
0x0068 
0x006C 
0x0070 
0x0074 
0x0078 



lostsyncjoop 
.limit 



presyncjoop 



gate_capture_data_position 



eor_enable_position 



totaLdata_slot_Iimit 



va!icLdata_slotJim'rt 



data_rowjogg!e_poshion 



00000000000 



totaLgrant_slotJimit 



valid_grant_slotJimit 



grant_row_toggle_position 



00000000000 



data_framing_pattern_position 



granLframing_pattern_position 



data_sor_offset 



grant_sor_offset 



data_oslot_num_start_position 



grant_os!ot_nurn_.start_position 



datajink_input_enable 



grantjink_input_enable 



data_tink_output_enabIe 



grantJink_output_enabIe 



start_oslot_position 



0 0 0 0 0 



misc_control 



sync„changed_dd Jnt.mask 



sync_changed_dc0Jnt_mask 



sync_changed_gc_jnt_mask 



00000000000 



sync_error_limiLdc1_inLmask 



sync_errorjimit_dc0jnt_mask 



sync_errorJimit_gcjnt_rnask 



00000000000 



crc_errorJimit_dc1Jnt_mask 



crc_errorJimit_dc0_inLmask 



crc_errorJimit_gcjnt_mask 



00000000000 



data_siot_sync_error_int_mask 



granLsIot_sync - error_inLmask 



sync_status_dd 



sync_status_dc0 



sync_status_gc 



00000000000 



data_slot_sync_error 



grant_slot_sync_error 



sync_errorJimit_dc1 



sync_error_limit_dc0 



sync_errorJimit_gc 



00000000000 



crc__errorJimrLdc1 



crc_errorJimit_dcO 



crc_error_limit_gc 



00000000000 



pp_sync__offset_countJ0 



pp_sync_offset_count_11 



pp_sync_offset_countJ2 



pp_sync_offset_countJ3 



pp_sync_offset_countJ4 



pp_sync_offset_count_15 



PP-Sync_offset_countJ6 



pp_sync_offseLcount_l7 
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Base Address 0x00434000 



31 



24 23 



16 15 



8 7 



Address 
0 Offset 



0 


0 


0 


pp_sync_offset_countJ8 


0 


0 


0 


PP_sync_offset_counU9 


0 


0 


0 


pp_sync_offset_count_H 0 


0 


0 


0 


PP-sync_offset_countJ1 1 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


inLmasks 


0 


0 


0 


0 


0 


0 0 0 0 0 0 Int.status 


0 


0 


0 


0 


0 


dp_is!ot_cmp 


0 


0 


0 


0 


0 


gnUstoLcmp 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dcO_IO 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dcO_l 1 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dcOJ2 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_counLdcOJ3 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dcOJ4 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


!ost_sync_error_count_dcOJ5 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


losLsync_error_count_dcOJ6 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


Iost_sync_error_count_dcOJ7 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dcOJ8 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_counLdcOJ9 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dcOJ1 0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


losLsync^erro^counLdcOJ1 1 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_counLdc1 JO 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


losLsync_error_coun1_dd J1 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


Iost_sync_error_count_dc1 J 2 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dc1 J 3 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


Iost_sync_error_count_dc1 _I4 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


Iost_sy nc_error_count_dd J 5 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dc1 J 6 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


Iost_sync_error_count_dc1 J7 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


losLsync_error_count_dc1 J 8 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dd J 9 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_dd J1 0 


o 


o 


o 


0 


0 


0 


0 


0 


0 


0 


0 


0 


losLsync_error_count_dd J1 1 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


tost_sync_error_count_gc_IO 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_cou nt_gcJ1 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_cou nt_gcJ2 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_cou nt_gc J3 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error.count_gc_M 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_e rror_count_gc J5 



0x0 07c 

0x0080 

0x0084 

0x0088 

0x0100 

0x0104 

0x0108 

0x010c 

0x0110 

0x0114 

0x0118 

0x01 1c 

0x0120 

0x0124 

0x0128 

0x01 2c 

0x0130 

0x0134 

0x0138 

0x01 3c 

0x0140 

0x0144 

0x0148 

0x01 4c 

0x0150 

0x0154 

0x0158 

0x01 5c 

0x0160 

0x0164 

0x0168 

0x01 6c 

0x0170 

0x0174 
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Base Address 0x00434000 



31 



24 23 



16 15 



8 7 



0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


lost_sync_error_count_gcJ6 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


0 


Iost_sync_error_count_gcJ7 


0 


0 


0 


0 


0 


Each row below consists of twelve 0 cells followed by the named count field.

lost_sync_error_count_gc_l8
lost_sync_error_count_gc_l9
lost_sync_error_count_gc_l10
lost_sync_error_count_gc_l11
crc_error_count_dc0_l0
crc_error_count_dc0_l1
crc_error_count_dc0_l2
crc_error_count_dc0_l3
crc_error_count_dc0_l4
crc_error_count_dc0_l5
crc_error_count_dc0_l6
crc_error_count_dc0_l7
crc_error_count_dc0_l8
crc_error_count_dc0_l9
crc_error_count_dc0_l10
crc_error_count_dc0_l11
crc_error_count_dc1_l0
crc_error_count_dc1_l1
crc_error_count_dc1_l2
crc_error_count_dc1_l3
crc_error_count_dc1_l4
crc_error_count_dc1_l5
crc_error_count_dc1_l6
crc_error_count_dc1_l7
crc_error_count_dc1_l8
crc_error_count_dc1_l9
crc_error_count_dc1_l10
crc_error_count_dc1_l11
crc_error_count_gc_l0
crc_error_count_gc_l1
crc_error_count_gc_l2
crc_error_count_gc_l3

Address Offset (this page, top to bottom):
0x0178  0x017c  0x0180  0x0184  0x0188  0x018c  0x0190  0x0194  0x0198  0x019c
0x01a0  0x01a4  0x01a8  0x01ac  0x01b0  0x01b4  0x01b8  0x01bc  0x01c0  0x01c4
0x01c8  0x01cc  0x01d0  0x01d4  0x01d8  0x01dc  0x01e0  0x01e4  0x01e8  0x01ec
0x01f0  0x01f4  0x01f8  0x01fc
Base Address 0x00434000

Bits 31:0 (twelve 0 cells, then the count field)      Address Offset
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l4       0x0200
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l5       0x0204
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l6       0x0208
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l7       0x020c
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l8       0x0210
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l9       0x0214
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l10      0x0218
0 0 0 0 0 0 0 0 0 0 0 0 | crc_error_count_gc_l11      0x021c



The Address Offset range 0x0300 - 0x041c has the same read values as the range 0x0100 -
0x021c, except that a read clears the register. A write to the range 0x0300 - 0x041c is not
allowed and will result in a bus error.
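
The relationship between the two ranges can be sketched in C. This is a minimal illustration, not driver code from the specification: the function and macro names are hypothetical, and the fixed 0x0200 distance between a counter and its clear-on-read alias is inferred from the two ranges quoted above.

```c
#include <stdint.h>

/* Offsets quoted in the text: counters at 0x0100-0x021c,
 * clear-on-read aliases at 0x0300-0x041c. Names are hypothetical. */
#define SYNC_CNT_FIRST  0x0100u
#define SYNC_CNT_LAST   0x021cu
#define SYNC_COR_DELTA  0x0200u  /* 0x0300 - 0x0100, inferred */

/* Return the clear-on-read alias of a counter offset, or 0 when the
 * offset is not word aligned or lies outside the counter range. */
static inline uint32_t sync_clear_on_read_offset(uint32_t offset)
{
    if (offset < SYNC_CNT_FIRST || offset > SYNC_CNT_LAST || (offset & 3u))
        return 0;
    return offset + SYNC_COR_DELTA;
}
```

Software that wants a non-destructive read would use the original offset; the alias is read-only, so it must never be the target of a write.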



Table 15-2: Synchronizer Misc Control Bit Fields

Bit     Field
0       disable_scrambler
1       disable_descrambler
2       bypass_sync
3       switch_sor_reset
5:4     slot_size_select
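
As an illustration of Table 15-2, the bit positions can be expressed as C masks. This is a sketch only: the macro and function names are hypothetical, and the meaning of each slot_size_select value is not given in this section, so the helpers only move the 2-bit field in and out of the register image.

```c
#include <stdint.h>

/* Bit positions from Table 15-2; identifiers are hypothetical. */
#define SYNC_MISC_DISABLE_SCRAMBLER    (1u << 0)
#define SYNC_MISC_DISABLE_DESCRAMBLER  (1u << 1)
#define SYNC_MISC_BYPASS_SYNC          (1u << 2)
#define SYNC_MISC_SWITCH_SOR_RESET     (1u << 3)
#define SYNC_MISC_SLOT_SIZE_SHIFT      4
#define SYNC_MISC_SLOT_SIZE_MASK       (3u << SYNC_MISC_SLOT_SIZE_SHIFT)

/* Replace the slot_size_select field (bits 5:4) in a register image. */
static inline uint32_t sync_misc_set_slot_size(uint32_t reg, uint32_t sel)
{
    return (reg & ~SYNC_MISC_SLOT_SIZE_MASK) |
           ((sel << SYNC_MISC_SLOT_SIZE_SHIFT) & SYNC_MISC_SLOT_SIZE_MASK);
}

/* Extract the slot_size_select field (bits 5:4) from a register image. */
static inline uint32_t sync_misc_get_slot_size(uint32_t reg)
{
    return (reg & SYNC_MISC_SLOT_SIZE_MASK) >> SYNC_MISC_SLOT_SIZE_SHIFT;
}
```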



15.1.2 Arbitration Memory Map

Base Address 0x00440000

Register (bits 31:0)                Address Offset
Grant Mapper Ram Slot 0             0x0000, 0x0004
Grant Mapper Ram Slot 1             0x0008, 0x000C
Grant Mapper Ram Slot 849           0x1A88, 0x1A8C
unmapped address space              0x1A90
Grant DeMapper Ram Slot 0           0x4000, 0x4004
Base Address 0x00440000

Register (bits 31:0)                                  Address Offset
Grant DeMapper Ram Slot 1                             0x4008, 0x400C
Grant DeMapper Ram Slot 849                           0x5A88, 0x5A8C
unmapped address space                                0x5A90
Grant Link 00 Status Reg                              0x8000
Grant Link 01 Status Reg                              0x8004
Grant Link 02 Status Reg                              0x8008
Grant Link 03 Status Reg                              0x800C
Grant Link 04 Status Reg                              0x8010
Grant Link 05 Status Reg                              0x8014
Grant Link 06 Status Reg                              0x8018
Grant Link 07 Status Reg                              0x801C
Grant Link 08 Status Reg                              0x8020
Grant Link 09 Status Reg                              0x8024
Grant Link 10 Status Reg                              0x8028
Grant Link 11 Status Reg                              0x802C
Grant Link Framing Pattern                            0x8030
Grant Link Framing Pattern / read only - all zeroes   0x8034
Grant Link Common 'Stuff' Reg                         0x8038
Grant Link 00 Line Overhead Status Reg                0x803C
Grant Link 01 Line Overhead Status Reg                0x8040
Grant Link 02 Line Overhead Status Reg                0x8044
Grant Link 03 Line Overhead Status Reg                0x8048
Grant Link 04 Line Overhead Status Reg                0x804C
Grant Link 05 Line Overhead Status Reg                0x8050
Grant Link 06 Line Overhead Status Reg                0x8054
Grant Link 07 Line Overhead Status Reg                0x8058
Grant Link 08 Line Overhead Status Reg                0x805C
Grant Link 09 Line Overhead Status Reg                0x8060
Grant Link 10 Line Overhead Status Reg                0x8064
Grant Link 11 Line Overhead Status Reg                0x8068


Proprietary and Confidential Information of Onex Communications Corporation



Base Address 0x00440000

Register (bits 31:0)                                          Address Offset
Grant Max Grants                                              0x806C, 0x8070, 0x8074
Grant Capture Interrupt Status Register                       0x8078
Grant Link 0 Capture Register Contents                        0x807C
Grant Link 1 Capture Register Contents                        0x8080
Grant Link 2 Capture Register Contents                        0x8084
Grant Link 3 Capture Register Contents                        0x8088
Grant Link 4 Capture Register Contents                        0x808C
Grant Link 5 Capture Register Contents                        0x8090
Grant Link 6 Capture Register Contents                        0x8094
Grant Link 7 Capture Register Contents                        0x8098
Grant Link 8 Capture Register Contents                        0x809C
Grant Link 9 Capture Register Contents                        0x80A0
Grant Link 10 Capture Register Contents                       0x80A4
Grant Link 11 Capture Register Contents                       0x80A8
Grant Link 0 Capture Register Bit Mask                        0x80AC
Grant Link 1 Capture Register Bit Mask                        0x80B0
Grant Link 2 Capture Register Bit Mask                        0x80B4
Grant Link 3 Capture Register Bit Mask                        0x80B8
Grant Link 4 Capture Register Bit Mask                        0x80BC
Grant Link 5 Capture Register Bit Mask                        0x80C0
Grant Link 6 Capture Register Bit Mask                        0x80C4
Grant Link 7 Capture Register Bit Mask                        0x80C8
Grant Link 8 Capture Register Bit Mask                        0x80CC
Grant Link 9 Capture Register Bit Mask                        0x80D0
Grant Link 10 Capture Register Bit Mask                       0x80D4
Grant Link 11 Capture Register Bit Mask                       0x80D8
Gnt Config | 000000000000 | Grant Parity Error Masks          0x80DC
Grant Error Interrupt Mask                                    0x80E0
Grant Error Interrupt Status Register                         0x80E4
Grant Parity Error                                            0x80E8
Grant Mapper Sequence Error | Grant DeMapper Sequence Error   0x80EC
Base Address 0x00440000

Register (bits 31:0)                                          Address Offset
Grant Dest Error Element[31:00]                               0x80F0
Grant Dest Error Element[47:32] | reserved                    0x80F4
unmapped memory space
Request Link 0 Priority 0 Statistics                          0x9000
Request Link 0 Priority 1 Statistics                          0x9004
Request Link 0 Priority 2 Statistics                          0x9008
Request Link 0 Priority 3 Statistics                          0x900C
Request Link 0 Priority 4 Statistics                          0x9010
Request Link 0 Priority 5 Statistics                          0x9014
Request Link 0 Priority 6 Statistics                          0x9018
Request Link 0 Priority 7 Statistics                          0x901C
Request Link 1 Priority 0 Statistics                          0x9020
Request Link 2 Priority 0 Statistics                          0x9040
Request Link 3 Priority 0 Statistics                          0x9060
Request Link 4 Priority 0 Statistics                          0x9080
Request Link 5 Priority 0 Statistics                          0x90A0
Request Link 6 Priority 0 Statistics                          0x90C0
Request Link 7 Priority 0 Statistics                          0x90E0
Request Link 8 Priority 0 Statistics                          0x9100
Request Link 9 Priority 0 Statistics                          0x9120
Request Link 10 Priority 0 Statistics                         0x9140
Request Link 11 Priority 0 Statistics                         0x9160
0 | Max Request Link 00 | 0 | Num Requests Link 00 | 0 x16    0x9180
0 | Max Request Link 01 | 0 | Num Requests Link 01 | 0 x16    0x9184
0 | Max Request Link 02 | 0 | Num Requests Link 02 | 0 x16    0x9188
0 | Max Request Link 03 | 0 | Num Requests Link 03 | 0 x16    0x918C
0 | Max Request Link 04 | 0 | Num Requests Link 04 | 0 x16    0x9190
0 | Max Request Link 05 | 0 | Num Requests Link 05 | 0 x16    0x9194
0 | Max Request Link 06 | 0 | Num Requests Link 06 | 0 x16    0x9198
0 | Max Request Link 07 | 0 | Num Requests Link 07 | 0 x16    0x919C
0 | Max Request Link 08 | 0 | Num Requests Link 08 | 0 x16    0x91A0
0 | Max Request Link 09 | 0 | Num Requests Link 09 | 0 x16    0x91A4
0 | Max Request Link 10 | 0 | Num Requests Link 10 | 0 x16    0x91A8
0 | Max Request Link 11 | 0 | Num Requests Link 11 | 0 x16    0x91AC







15.1.2.1 Grant Mapper Ram

Bits:  31:28          27:24          23:20          19:16          15:12          11:8           7:0         Address Offset
       Grant Link 00  Grant Link 01  Grant Link 02  Grant Link 03  Grant Link 04  Grant Link 05  0000 0000   0x0000
       Grant Link 06  Grant Link 07  Grant Link 08  Grant Link 09  Grant Link 10  Grant Link 11  0000 0000   0x0004

Bits 31-8 are read/writable; bits 7-0 always read back zero. Upon reset the registers will
contain random patterns and must be written to.

Bit Coding is:

0000 - Idle
0001 - GE0
0010 - GE2
0011 - GE3
1000 - LOH Framing Pattern
1001 - LOH Status
1010 - LOH ID
1011 - LOH Stuff
1100 - LOH Sync
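
A Grant Mapper Ram word can be assembled from six 4-bit codes as sketched below. The sketch assumes the field order read from the register diagram (the first link of each word in bits 31:28, the sixth in bits 11:8, bits 7:0 zero); the enum and function names are hypothetical, and the entry for code 0001 is named after its code value because its printed label is hard to read in the source.

```c
#include <stdint.h>

/* 4-bit slot codes from the "Bit Coding" list; identifiers are hypothetical. */
enum grant_map_code {
    GM_IDLE        = 0x0,
    GM_GE_0001     = 0x1,  /* entry listed for code 0001 */
    GM_GE2         = 0x2,
    GM_GE3         = 0x3,
    GM_LOH_FRAMING = 0x8,
    GM_LOH_STATUS  = 0x9,
    GM_LOH_ID      = 0xA,
    GM_LOH_STUFF   = 0xB,
    GM_LOH_SYNC    = 0xC
};

/* Pack six link codes into one 32-bit word: codes[0] lands in bits 31:28,
 * codes[5] in bits 11:8, and bits 7:0 stay zero (assumed layout). */
static inline uint32_t grant_mapper_pack(const uint8_t codes[6])
{
    uint32_t word = 0;
    for (int i = 0; i < 6; i++)
        word |= (uint32_t)(codes[i] & 0xFu) << (28 - 4 * i);
    return word;
}
```

Each slot needs two such words, one for links 00-05 and one for links 06-11. Because the reset contents are random, every slot should be written before the mapper is used.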

15.1.2.2 Grant DeMapper Ram

Bits:  31  30:28          27  26:24          23  22:20          19  18:16          15  14:12          11  10:8           7:0         Address Offset
       0   Grant Link 00  0   Grant Link 01  0   Grant Link 02  0   Grant Link 03  0   Grant Link 04  0   Grant Link 05  0000 0000   0x4000
       0   Grant Link 06  0   Grant Link 07  0   Grant Link 08  0   Grant Link 09  0   Grant Link 10  0   Grant Link 11  0000 0000   0x4004

The reset value is random; these registers must be programmed before use.

Bit Coding is:

000 - Idle
001 - GE0
010 - GE2
011 - GE3
100 - LOH Capture

15.1.2.3 Grant Link Status Register

Bits     Field
31       fifo fill
30       0
29:24    Current Fifo Watermark
23       0
22:16    Grants Received Last Row
15       0
14:8     Grants Dropped Last Row
7        0
6:0      Grants Forwarded Last Row

Address Offset: 0x8000 - 0x802C

This is a read-only register, updated at the end of every row.
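
The status word can be unpacked as in the following sketch. The field boundaries follow the bit labels of the register diagram (bit 31, 29:24, 22:16, 14:8, 6:0); the struct and function names are hypothetical, and the one-register-per-link spacing of 4 bytes is taken from the 0x8000 - 0x802C range covering 12 links.

```c
#include <stdint.h>

/* Decoded view of one Grant Link Status Register; names are hypothetical. */
struct grant_link_status {
    unsigned fifo_fill;         /* bit 31 */
    unsigned fifo_watermark;    /* bits 29:24 */
    unsigned grants_received;   /* bits 22:16, last row */
    unsigned grants_dropped;    /* bits 14:8, last row */
    unsigned grants_forwarded;  /* bits 6:0, last row */
};

static inline struct grant_link_status grant_link_status_decode(uint32_t reg)
{
    struct grant_link_status s;
    s.fifo_fill        = (reg >> 31) & 0x1u;
    s.fifo_watermark   = (reg >> 24) & 0x3Fu;
    s.grants_received  = (reg >> 16) & 0x7Fu;
    s.grants_dropped   = (reg >> 8)  & 0x7Fu;
    s.grants_forwarded = reg & 0x7Fu;
    return s;
}

/* Address offset of the status register for a given link (0-11). */
static inline uint32_t grant_link_status_offset(unsigned link)
{
    return 0x8000u + 4u * link;
}
```

Since the register is refreshed at the end of every row, two reads within one row return the same counts.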
15.1.3 Grant Link Framing Pattern

Bits:  31:0                                              Address Offset
       Frame0                                            0x8030

Bits:  31:28    27:24     23:0                           Address Offset
       0000     Frame1    0 (x24)                        0x8034

The grant link framing pattern is 36 bits wide and is made by concatenating Frame0 with
Frame1 such that: FramingPattern[35:0] = Frame1[27:24].Frame0[31:0]. The reset value of Frame0 is
0x0000F628 and of Frame1 is 0, so that the 36-bit framing pattern is 0x00000F628.
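
The concatenation can be written out as a short C helper. This is a sketch under the layout stated above (Frame1 occupies bits 27:24 of the word at 0x8034); the function name is hypothetical.

```c
#include <stdint.h>

/* Build the 36-bit framing pattern from the two register words:
 * bits 35:32 come from Frame1 word bits 27:24, bits 31:0 from Frame0. */
static inline uint64_t grant_framing_pattern(uint32_t frame0_word,
                                             uint32_t frame1_word)
{
    uint64_t upper = (frame1_word >> 24) & 0xFu;
    return (upper << 32) | (uint64_t)frame0_word;
}
```

With the reset values (Frame0 word 0x0000F628, Frame1 word 0) the helper returns 0xF628, which is the reset pattern quoted in the text.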

15.1.3.1 Grant Link Stuff Register

Bits:  31:0                          Address Offset
       Grant Stuff Register          0x8038

This is a 32 bit read/write register. Reset value is all zeros.

15.1.3.2 Grant Link Overhead Status Register

Bits:  31:0                                                 Address Offset
       Grant Link Overhead Status Programmable Register     0x803C - 0x8068

This is a 32 bit read/write register. All bits are writeable. The reset value is 0. The use of this
register is still undefined.

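15.1.3.3 Maximum Grants Register

Bits:  31:24 | 23:16 | 15:8 | 7:0
[Register diagram: per-link fields beginning with Link 00; the remaining field labels are not legible in the source.]

Address Offset: 0x806C, 0x8070, 0x8074

The reset value of these registers allows grants up to 96 PDUs for the following row. Writes to
this register are effective immediately.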

15.1.3.4 Grant Capture Interrupt Status Register

Bits:  31:12       11:0                       Address Offset
       reserved    Link Interrupt Source      0x8078

Link Interrupt Source - bit is active hi if contents of capture slot are different than previous
rows.

Bit 0: Link 00
Bit 1: Link 01
Bit 2: Link 02
...
Bit 11: Link 11

15.1.3.5 Grant Parity Error Masks / Grant Config

Bits:  31:24           23:12       11:0                         Address Offset
       Grant Config    reserved    Grant Parity Error Masks     0x80DC

Grant Parity Error Masks - bit is active hi to enable parity conformance. Reset value is 0.
Bit 0: Link 00
Bit 1: Link 01
Bit 2: Link 02
...
Bit 11: Link 11

Grant Config

Bit 7: Grant Rotate Enable: Reset value is 0. Set to a 1 to enable grant parser rotation.
Bit 6: Disable Num Field (grants are forced to single PDU reservation mode). Default is 0.

15.1.3.6 Grant Error Interrupt Mask

Bits:  31:0                            Address Offset
       Grant Error Interrupt Mask      0x80E0

Bit 31: Grant Start Signals Unaligned Error Mask: Reset is 0. Program to 1 to enable this type
of error.

Bit 30: Grant Minimum Start Pulse Error Mask: Reset is 0. Program to 1 to enable this type of
error. This error mask is ineffective for grants since there is no minimum pulse period (it is a holdover
from the request parser).

Bit 29: Grant Remaining Mask: At the end of a row, if grants are remaining in the buffers, this
will trigger an interrupt when set to a 1. Reset value is 0.

Bit 28: Grant Fifo Filled Mask: If the grant fifo overflows for a link, this will trigger an interrupt when
programmed to a 1. Reset value is 0. Fifo watermarks need to be investigated for the link.

Bits 23-12: Grant Mapper Sequencing Mask: Set to a 1 to allow sequencing errors to generate
an interrupt.

Bits 11-0: Grant Demapper Sequencing Mask: Set to a 1 to allow sequencing errors to
generate an interrupt.

15.1.3.7 Grant Error Interrupt Status Register

Bits:  31:24                                    23:0        Address Offset
       Grant Error Interrupt Status Register    reserved    0x80E4

Bit 31: gnt_dest_error_int - signals that a grant element has attempted to go to an output it
isn't supposed to. Write a 1 to clear this interrupt source.

Bit 30: gnt_remaining_int - signals that grants are still in the fifo at the end of the row. Write a 1
to clear this interrupt source.

Bit 29: gnt_fifo_fill_int - signals that a grant fifo has overfilled. Write a 1 to clear this
interrupt source.

Bit 28: gnt_start_min_error_int - signals that grant elements have arrived too quickly. Write a
1 to clear this interrupt source.

Bit 27: gnt_start_align_error_int - signals an alignment error. Write a 1 to clear this interrupt
source.

Bit 26: gnt_mapper_int - signals to read the grant sequencing error register.
Bit 25: gnt_demapper_int - signals to read the grant sequencing error register.
Bit 24: gnt_parity_int - signals to read the grant parity error register.
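
The bit list above distinguishes sources that are cleared by writing a 1 to this register (bits 31-27) from sources that point to another register (bits 26-24). A handler might separate the two groups as in this sketch; the macro and function names are hypothetical and the register access itself is left out.

```c
#include <stdint.h>

/* Bit positions from the list above; identifiers are hypothetical. */
#define GNT_DEST_ERROR_INT         (1u << 31)
#define GNT_REMAINING_INT          (1u << 30)
#define GNT_FIFO_FILL_INT          (1u << 29)
#define GNT_START_MIN_ERROR_INT    (1u << 28)
#define GNT_START_ALIGN_ERROR_INT  (1u << 27)
#define GNT_MAPPER_INT             (1u << 26)
#define GNT_DEMAPPER_INT           (1u << 25)
#define GNT_PARITY_INT             (1u << 24)

/* Sources cleared by writing a 1 back to this register. */
#define GNT_W1C_MASK (GNT_DEST_ERROR_INT | GNT_REMAINING_INT | \
                      GNT_FIFO_FILL_INT | GNT_START_MIN_ERROR_INT | \
                      GNT_START_ALIGN_ERROR_INT)

/* Value to write back to clear every directly clearable source that is set. */
static inline uint32_t gnt_status_w1c_value(uint32_t status)
{
    return status & GNT_W1C_MASK;
}

/* Nonzero when the Grant Sequence Error register should be read. */
static inline int gnt_needs_sequence_read(uint32_t status)
{
    return (status & (GNT_MAPPER_INT | GNT_DEMAPPER_INT)) != 0;
}

/* Nonzero when the Grant Parity Error register should be read. */
static inline int gnt_needs_parity_read(uint32_t status)
{
    return (status & GNT_PARITY_INT) != 0;
}
```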



15.1.3.8 Grant Parity Error

Bits:  11:0                           Address Offset
       Grant Link Parity Status       0x80E8

Bit 11: Link 11
Bit 10: Link 10
...
Bit 0: Link 0

When a grant parity error is detected, writing a 1 to the bit in this register which is causing
the interrupt will clear the interrupt.

15.1.3.9 Grant Sequence Error

Fields (bit 31 down to bit 0):                                                   Address Offset
0 0 0 0 | Mapper Sequencing Status | 0 0 0 0 | DeMapper Sequencing Status        0x80EC

When a Sequencing Error is detected, this register should be read to determine the input or
output link which has generated the error. Write a 1 to the offending bit to clear the interrupt.

Bit 24: Output Link 11
...
Bit 16: Output Link 0
Bit 11: Input Link 11
...
Bit 0: Input Link 0

15.1.3.10 Grant Destination Error Element

Fields (bit 31 down to bit 0):                  Address Offset
Grant Element Bits 47 - 16                      0x80F0
Grant Element Bits 15 - 0 | reserved            0x80F4

When a grant link wiring error is detected, the grant element is captured into a register and is
made accessible here. This is a read-only register.



15.1.3.11 Link X Request Element Priority Statistics Registers

Bits:  31:27   26:16                                  15:11   10:0                                    Address Offset
       0       Link 00 Priority 0 Requests Dropped    0       Link 00 Priority 0 Request Received     0x9000
       0       Link 00 Priority 1 Requests Dropped    0       Link 00 Priority 1 Request Received     0x9004
       0       Link 00 Priority 2 Requests Dropped    0       Link 00 Priority 2 Request Received     0x9008
       0       Link 00 Priority 3 Requests Dropped    0       Link 00 Priority 3 Request Received     0x900C
       0       Link 00 Priority 4 Requests Dropped    0       Link 00 Priority 4 Request Received     0x9010
       0       Link 00 Priority 5 Requests Dropped    0       Link 00 Priority 5 Request Received     0x9014
       0       Link 00 Priority 6 Requests Dropped    0       Link 00 Priority 6 Request Received     0x9018
       0       Link 00 Priority 7 Requests Dropped    0       Link 00 Priority 7 Request Received     0x901C

This pattern repeats for each of the 12 links; the offset between different links'
statistics registers is 0x20. These registers are read-only.
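
The addressing rule (eight priority registers per link, 0x20 between links) can be sketched as follows. The function names are hypothetical, and the field positions (dropped count in bits 26:16, received count in bits 10:0) are read from the bit labels of the register diagram, so they should be treated as an assumption.

```c
#include <stdint.h>

/* Address offset of the statistics register for a link (0-11) and
 * priority (0-7): base 0x9000, 0x20 per link, 4 bytes per priority. */
static inline uint32_t req_stats_offset(unsigned link, unsigned prio)
{
    return 0x9000u + 0x20u * link + 4u * prio;
}

/* Requests dropped, assumed to occupy bits 26:16. */
static inline unsigned req_stats_dropped(uint32_t reg)
{
    return (reg >> 16) & 0x7FFu;
}

/* Requests received, assumed to occupy bits 10:0. */
static inline unsigned req_stats_received(uint32_t reg)
{
    return reg & 0x7FFu;
}
```

For example, link 11 at priority 0 resolves to 0x9160, which matches the last Priority 0 entry in the Arbitration Memory Map.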

15.1.3.12 Link X Request Element Counters

Fields (bit 31 down to bit 0):                                                          Address Offset
0 | Link 00 Maximum Requests / Row | 0 | Link 00 Num Requests Forwarded | 0 x16         0x9180

The Maximum Requests per row field has a reset value of 96 and is read-writable. The
number of requests forwarded refers to the previous row and is read-only.

15.1.4 Miscellaneous Registers

Base Address: 0x450000

Register (bit 31 down to bit 0)                                       Address Offset
0 0 0 0 | ilink_mask00 | ilink_spare00 | 0 0 0 0 0 | stage num00      0x0000
0 0 0 0 | ilink_mask01 | ilink_spare01 | 0 0 0 0 0 | stage num01      0x0004
0 0 0 0 | ilink_mask02 | ilink_spare02 | 0 0 0 0 0 | stage num02      0x0008
0 0 0 0 | ilink_mask03 | ilink_spare03 | 0 0 0 0 0 | stage num03      0x000C
0 0 0 0 | ilink_mask04 | ilink_spare04 | 0 0 0 0 0 | stage num04      0x0010
0 0 0 0 | ilink_mask05 | ilink_spare05 | 0 0 0 0 0 | stage num05      0x0014
0 0 0 0 | ilink_mask06 | ilink_spare06 | 0 0 0 0 0 | stage num06      0x0018
0 0 0 0 | ilink_mask07 | ilink_spare07 | 0 0 0 0 0 | stage num07      0x001C
0 0 0 0 | ilink_mask08 | ilink_spare08 | 0 0 0 0 0 | stage num08      0x0020



Base Address: 0x450000

Register (bit 31 down to bit 0)                                       Address Offset
0 0 0 0 | ilink_mask09 | ilink_spare09 | 0 0 0 0 0 | stage num09      0x0024
0 0 0 0 | ilink_mask10 | ilink_spare10 | 0 0 0 0 0 | stage num10      0x0028
0 0 0 0 | ilink_mask11 | ilink_spare11 | 0 0 0 0 0 | stage num11      0x002C

Remaining registers on this page, top to bottom:

Programmable Switch ID Number (r/w)
iTAP Switch Part Number (read only)
Switch Revision (read only)
iTAP Switch Module Reset Register
SPI Page Register Size | SPI MODE | 0 0 0 0 | BootMode | 0 0 | WRST
SBus Module Full Decode
SBUS Timer
Host to Core Push Button Interrupts | 0 x16
Core to Host Push Button Interrupts | 0 x16
Watchdog Control Register
Watchdog Timeout Register
Watchdog Timer Value
Watchdog Service Register
Interrupt Mask Register
Interrupt 0 Register
Interrupt 1 Register
Interrupt 2 Register
Interrupt 3 Register
PI00 PI01 PI02 PI03 PI04 PI05 PI06 PI07 PI08 PI09 PI10 PI11 PI12
PI13 PI14 PI15 PI16 PI17 PI18 PI19 PI20 PI21 PI22 PI23 PI24
TBD - if we need more interrupts
iTAP Switch Test Mode

Address Offset (remaining registers, top to bottom):
0x0030  0x0034  0x0038
0x003C  0x0040  0x0044  0x0048  0x004C  0x0050  0x0054  0x0058
0x005C  0x0060  0x0064  0x0068  0x006C  0x0070  0x0074  0x0078


15.1.4.1 Switch Stage Numbers

Fields (bit 31 down to bit 0):                                          Address Offset
0 0 0 0 0 | ilink_mask00 | ilink_spare00 | 0 0 0 0 0 | stage num00      0x0000
16 iTAP Debug/Trace Interfaces

The iTAP Switch will have 2 different visibility points of the local Tensilica processor. The first 
visibility point is a trace port. The trace port allows one to monitor program flow by outputting 
information about the program counter. This is a non-intrusive interface that only monitors 
information. The second port is the JTAG debug port. This is a bi-directional communications 
pathway that allows a programmer to set breakpoints, single step through code, and peek/poke at 
processor state bits as well as memory locations. 

In this document a brief overview of the debug interfaces on commercially available
microprocessors is given, followed by the iTAP implementation of each.

16.1 Program Trace / Debug Support of other Processors

Most processors on the market today have some sort of JTAG on-chip debug support. This
allows a programmer to interface to the chip and single step through his or her code. Breakpoints can
be set, memory modified, and individual processor registers set or cleared. Fewer processors have
a 'trace' port which allows the monitoring of a program. Often, software bugs only crop up when the code
has been running 'at speed', and inserting special debug code or slowing down the system by
single stepping through code masks the problem. Here is a brief list of what other processor vendors
provide.

Motorola - Special 8 bit Debug trace port + on chip debug via JTAG 

ARM - Embedded ICE. This provides JTAG control.

Intel StrongArm - JTAG single step, etc. 

SuperH - JTAG only 

TI DSP - JTAG only 

Analog Devices DSP - JTAG 

MIPS - JTAG. (Some licensed vendors may support more, couldn't find any info). 

Motorola is the only vendor that has a debug module which provides for a program trace. This
is used in their ColdFire microprocessors. It is a byte-wide interface which is divided into 2 nibbles.
These two nibbles are the PST and DDATA.



PST[3:0]    Definition
0000        Continue Execution
0001        Begin Execution of an instruction
0010        Reserved
0011        Entry into user-mode
0100        Begin Execution of Pulse or WDATA
0101        Begin execution of a taken branch
0110        Reserved
0111        Begin execution of RTE inst
1000        Begin 1 byte DDATA xfer
1001        Begin 2 byte DDATA xfer
1010        Begin 3 byte DDATA xfer
1011        Begin 4 byte DDATA xfer
1100        Exception Processing
1101        Emulator mode entry exception processing
1110        Processor is stopped, wait for irq
1111        Processor is halted
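
A trace-capture tool could turn the PST nibble into text with a lookup table such as the one sketched here. The strings are copied from the table above; the identifiers are hypothetical, and the helper that derives the DDATA transfer length simply restates codes 1000 through 1011.

```c
#include <stdint.h>

/* PST[3:0] definitions, indexed by the 4-bit code (from the table above). */
static const char *const pst_definitions[16] = {
    "Continue Execution",
    "Begin Execution of an instruction",
    "Reserved",
    "Entry into user-mode",
    "Begin Execution of Pulse or WDATA",
    "Begin execution of a taken branch",
    "Reserved",
    "Begin execution of RTE inst",
    "Begin 1 byte DDATA xfer",
    "Begin 2 byte DDATA xfer",
    "Begin 3 byte DDATA xfer",
    "Begin 4 byte DDATA xfer",
    "Exception Processing",
    "Emulator mode entry exception processing",
    "Processor is stopped, wait for irq",
    "Processor is halted"
};

/* Text for a PST code; only the low nibble is used. */
static inline const char *pst_definition(uint8_t pst)
{
    return pst_definitions[pst & 0xFu];
}

/* Number of DDATA bytes announced by codes 1000-1011, else 0. */
static inline int pst_ddata_bytes(uint8_t pst)
{
    unsigned code = pst & 0xFu;
    return (code >= 0x8u && code <= 0xBu) ? (int)(code - 0x8u) + 1 : 0;
}
```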



The shortfalls of this interface are that it uses an extra cycle to start sending program counter 
relative branch information and offers little insight as to why the processor has stalled. On the 'plus' 
side it has a special PULSE mode which can be set by the processor for performance mode. The 
Tensilica debug interface gives extensive processor stall information. When it has more information to
send than there is time to send it, the trace port will halt the processor. The iTAP debug interface 
borrows heavily from the Motorola family of processors, with some modifications. 

16.2 iTAP Trace Port 

This trace interface is used to follow a program trace of a Tensilica microprocessor. On the 
Tensilica processor, one of the configuration options is to create a Trace Port. This port gives a great 
deal of information regarding the processor's internal state and when it encounters pipeline bubbles, 
etc. Using it, one can determine the execution flow of a program in real time. Unfortunately, it is a
41-bit interface.

The iTAP Debug port is completely non-intrusive to the Tensilica core. It will never halt the 
processor. The debug port has the ability to allow tracing of the program counter, as well as load/
store address or data words. The amount of data output is configurable by the programmer. It is 
output in real time at the internal core rate of the Tensilica processor. 

Since the port is non-intrusive and has little buffering, if too many jumps or load/stores occur
in a row the data will be lost. Judicious selection of which data to output will be necessary.

16.2.1 Tensilica Trace Port Interface 

SUBJECT TO TENSILICA NDA 
17 Miscellaneous Modules

17.1 Test Pattern Generator & Analyzer 

RESERVED 

17.2 Test Interface Bus Multiplexer 

RESERVED 

17.3 System Control 

RESERVED 

17.4 JTAG 

RESERVED 
18 Clocking, Reset, and Initialization

18.1 Clocking 

TBD 

18.2 Initialization 

Initialization of the master processor is described here. After its initialization is complete, the
switch datapath, high speed serial links and several other sub modules need to be initialized. Please
consult their individual design specifications for the correct sequence and initialization registers.




Figure 18-1: iTSE Initialization & Reset Sequence 

• Reset State 

Power has been applied to the iTSE, hresetx is asserted (active low). The BootDev, BootCfg,
RiscClkSel and BootRisc pins are driven to the correct values (see the pin description for
appropriate values). The choices allow for booting out of the SPI, Flash, FCRAM or Internal Ram.
Hresetx is deasserted.

• Master (API) Processor Boots

Either the SPI has been selected OR the host has downloaded code to the external Host
Expansion Ram / internal Ram and then written to the Master API Processor Reset bit. In either
case the master processor now goes to its reset vector and begins executing code. This vector is
fixed at processor generation time. Therefore several instructions will be hardcoded into the
iTSE memory map at the processor's reset vector which perform a branch to the selected
memory segment, i.e. flash, SPI, host expansion ram or internal ram.

• Initial Power on Self Test 

Coming out of reset the watchdog timer will be enabled with a default timeout of X ms. This
will give the master processor enough time to execute several instructions and update the
watchdog timer.



• Initial Power on Self Test Failure 

This state is reached by either of 2 conditions: 

The master API processor never executes any good code due to either a severe chip
manufacturing bug or a circuit board problem (opens/shorts, etc.). When this occurs the watchdog timer
is never cleared by software and expires.

The master API processor begins booting and during one of its tests discovers an error, for
example a stuck bit detected while testing its Instruction RAM. It can write to the BootFail
bit and immediately assert it, or it can let the watchdog expire.

• Host Processor Assisted Boot 

The Host processor will access the internal memory or Host Expansion FCRam and download 
code to the API processor. It can then set the BootDev and BootCFG pins accordingly and set 
the RiscBoot pin. 

• No Boot 

No code is ever downloaded to the API processor, the iTSE is run in host-only mode. 



19 I/O Definitions and Timing 

20 Packaging and Pinout 

20.1 Chiron Packaging 

The following chart is taken from the IBM SA-27E databook. 



1.27 mm pitch Laminate BGA

still looking for this info.

Package (column headings not legible in source)
42.5 x 42.5 mm: 1088 total balls     -      640     748     748     748
40 x 40 mm: 960 total balls          -      640     664     664     664
37.5 x 37.5 mm: 840 total balls      -      576     576     576     576
35 x 35 mm: 728 total balls          -      500     500     500     500
33 x 33 mm: 624 total balls          412    456     456     456     456
6lm                                  5.66   6.73    7.86    9.08    10.39



Table 20-1: SA-27E Organic Die-Package Menu: Single Density Footprint Laminate BGA 
The iTAP Switch is initially targeted at the 624 ball 1.27mm pitch BGA package. 

A preliminary pin list for the iTAP switch is given below: 
20.2 Chiron I/O Summary 



Function                           Signals                                              Number

Input Links                        LINK_IN00_D00_P,_N, LINK_IN00_D01_P,_N,              72
                                   LINK_IN00_GNTOUT_P,_N
                                   -12 of these-
Output Links                       LINK_OUT00_D00_P,_N, LINK_OUT00_D01_P,_N,            72
                                   LINK_OUT00_GNTIN_P,_N
                                   -12 of these-
Host Interface Address and Data    HADDR12-00, HDATA15-00                               29
Host Interface Control             H_DSx, H_CSx, H_WEx1-0, H_DSACKx, H_IRQ1-0x          7
Host Expansion FCRAM               HEXP_ADDR (15), HEXP_DATA (16), HEXP_CNTRL (10),     43
                                   HEXP_VREF (2)
SPI Bus                            SPICLK, SPIMOSI, SPIMISO, SPICS0, SPICS1             5
UART                               TX, RX, D_TX, D_RX                                   4
Trace Port                         TPSTAT[3:0], TPDATA[3:0], TPCLK                      9
JTAG                               TCK, TMS, TRSx, TDI, TDO                             5
System Control                     Ref Clock 1 (2), Ref Clock 2 (2), RiscClock (1),     25
                                   PLLA (4), PLLB (4), PLLC (4),
                                   HRESETx, SRESETx, SOR_SYNC (2), CTM[3:0]
Misc                               BOOTRISC, BOOTDEV (2), BOOTCFG (2),                  9
                                   RISC_CLK_SEL, Thermal (2), BOOTFAILx (1)
Test Access Port                   TIB                                                  64
Test Pins                          TE, Z                                                2
Total I/O Pins Assigned                                                                 346
Total I/O Pins Unassigned                                                               38
Total I/O Pins                     Assigned I/O Pins + Unassigned I/O Pins              384
Dedicated Power (*1)               VCC (SSTL), VCC (Core), VCC (I/O), GND               168
Unilink Power                      VCC & VSS for Unilink (1.8V), Unilink (2.5V)         72
Total Power Pins                   Dedicated Power + Unilink Power                      240
Total Package Pins                 Total I/O Pins + Total Power Pins                    624



Table 20-2: iTAP Switch Chip I/O 
*1: These are dedicated power pins on the chip as defined by IBM document #SA14-2180-01
Laminate Ball Grid Array p.97 Fig. 38
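
The totals in Table 20-2 can be cross-checked with a few lines of C. The per-function counts below are copied from the Number column of the table; the identifiers are hypothetical and the sketch only verifies the arithmetic.

```c
/* Per-function I/O pin counts from Table 20-2, in table order:
 * input links, output links, host addr/data, host control, FCRAM,
 * SPI, UART, trace port, JTAG, system control, misc, TIB, test pins. */
static const int io_pin_counts[] = {
    72, 72, 29, 7, 43, 5, 4, 9, 5, 25, 9, 64, 2
};

enum {
    UNASSIGNED_IO_PINS   = 38,
    DEDICATED_POWER_PINS = 168,
    UNILINK_POWER_PINS   = 72
};

/* Sum of the assigned I/O pins listed above. */
static inline int assigned_io_pins(void)
{
    int total = 0;
    for (unsigned i = 0; i < sizeof io_pin_counts / sizeof io_pin_counts[0]; i++)
        total += io_pin_counts[i];
    return total;
}

/* Assigned plus unassigned I/O pins. */
static inline int total_io_pins(void)
{
    return assigned_io_pins() + UNASSIGNED_IO_PINS;
}

/* I/O pins plus all power pins. */
static inline int total_package_pins(void)
{
    return total_io_pins() + DEDICATED_POWER_PINS + UNILINK_POWER_PINS;
}
```

The three results are 346, 384 and 624, which agree with the table and with the 624 ball package named as the initial target.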



20.3 Signal Description 

This section describes the iTAP Switch pinout. Active low signals have an 'x' suffix.

20.3.1 Host Interface 



Signal Name         Mnemonic        Direction   I/O Pad      Description

Address Bus         HADDR[12:00]    I           BT3335       Host Interface Address bus, word addressable, byte
                                                             writable. It can address 16K bytes of internal memory
Data Bus            HDATA[15:00]    B           BT3335       Host Interface 16 bit data bus
Chip Select         HCSx            I           BT3335.PU    Host Interface Chip Select
Data Strobe         HDSx            I           BT3335       Host Interface Data Strobe
Data Acknowledge    DSACKx          O           BT3335.PU    Host Interface Data Acknowledge
Write Enable        HWEx[1:0]       I           BT3335       Host Write Enable
Interrupts          HIRQx[1:0]      O           BT3335.PU    Host Interrupts



Total number of pins: 36 
20.3.2 Host Expansion FCRAM Interface 



Signal Name               Mnemonic        Direction   I/O Pad      Description

Reference Voltage         HEXP_VRef       O           VSSTL2R1     Voltage References (2)
Address Bus[14:00]        HEXP_ADDR       O           BSSTL2C2     Address Bus (15 bits)
Data Bus[15:00]           HEXP_DATA       B           BSSTL2C2     Data Bus (16 bits)
Function Select           HEXP_FN         O           BSSTL2C2     Address / Data Function Pin
Chip Select               HEXP_CSx        O           BSSTL2C2     Active low chip select
Bank Address              HEXP_BA(1:0)    O           BSSTL2C2     2 Bit Bank Address
CLK(+/-)                  HEXP_CLK        O           BSSTL2C2     Differential Clock
Data Mask (Lower Byte)    HEXP_DML        O           BSSTL2C2     Write Mask (Data Bits 7-0)
Data Mask (Upper Byte)    HEXP_DMU        O           BSSTL2C2     Write Mask (Data Bits 15-8)
Lower Byte Data Strobe    HEXP_LDQS       B           BSSTL2C2     Data Strobe (Bits 7-0)
Upper Byte Data Strobe    HEXP_UDQS       B           BSSTL2C2     Data Strobe (Bits 15-8)



Total number of pins: 43 
Note: We have 2 Reference Voltage pins for 42 I/O. Is this enough?
20.3.3 SPI Bus 



Signal Name               Mnemonic      Direction   I/O Pad     Description
SPI Clock                 SPICLK        O           BT3335      SPI Clock
SPI Master Out Slave In   SPIMOSI       O           BT3335      SPI Data Write
SPI Master In Slave Out   SPIMISO       I           BT3335_PD   SPI Data Read
SPI Chip Select           SPICSx[1:0]   O           BT3335      SPI Chip Selects



Total Number of pins: 5 
20.3.4 UART 



Signal Name     Mnemonic   Direction   I/O Pad     Description
Uart TX         TX         O           BT3335      Transmit
Uart RX         RX         I           BT3335_PD   Receive
Uart Daisy TX   DTX        I           BT3335_PD   Daisy Chained Transmit
Uart Daisy RX   DRX        O           BT3335      Daisy Chained Receive



Total Number of pins: 4 
20.3.5 JTAG 



Signal Name     Mnemonic   Direction   I/O Pad   Description
JTAG Clock      TCK        I           BT3335    JTAG Clock
JTAG Command    TMS        I           BT3335    JTAG Command
JTAG Reset      TRSx       I           BT3335    JTAG Reset (Active Low)
JTAG Data In    TDI        I           BT3335    JTAG Serial Data in
JTAG Data Out   TDO        O           BT3335    JTAG Serial Data out



Total number of pins: 5 
20.3.6 Trace Port 



Signal Name         Mnemonic      Direction   I/O Pad   Description
Trace Port Status   TPSTAT[3:0]   O           BT3335    Tensilica Program Counter Change Status
Trace Port Data     TPDATA[3:0]   O           BT3335    Tensilica Program Counter Relative Change
Trace Port Clock    TPCLK         O           BT3335    This is the core clock.



These pins run at the core clock rate (cclk), 250 MHz nominal. An investigation is needed to
determine what drivers should be used for these pins. Initially LVTTL.

Total number of pins: 9 
20.3.7 Miscellaneous 



Signal Name                     Mnemonic       Direction   I/O Pad     Description
Boot Risc Processor             BOOTRISC       I           BT3335_PD   Upon reset, this bit is latched to determine
                                                                       if the risc core should boot.
                                                                       0 = No Boot
                                                                       1 = Boot
Boot Device Select              BOOTDEV[1:0]   I           BT3335_PD   Upon reset, these bits are read to determine
                                                                       the method by which the processor should be
                                                                       bootstrapped.
                                                                       00 = Serial Prom (SPI)
                                                                       01 = Internal Shared RAM
                                                                       11 = External FCRAM
Boot Device Config              BOOTCFG[1:0]   I           BT3335_PD   Depending upon the boot device selected, these
                                                                       bits take on different meanings, listed below.
                                                                       BOOTDEV = 00 (Serial Prom):
                                                                         BOOTCFG[1] = Mem Size
                                                                           0 = 16 bit addressing
                                                                           1 = 24 bit addressing
                                                                         BOOTCFG[0] = SPI Device
                                                                           0 = SPI CS 0
                                                                           1 = SPI CS 1
                                                                       BOOTDEV = 01 (Internal Shared RAM):
                                                                         BOOTCFG = unused
                                                                       BOOTDEV = 11 (FCRAM):
                                                                         BOOTCFG = unused
Master Processor Clock Select   RISC_CLK_SEL   I           BT3335_PD   Upon reset, this bit is latched to determine
                                                                       the clock source of the embedded risc core.
                                                                       0 = Core Clock
                                                                       1 = Risc Clock
Power-On Boot Failure           BOOTFAILx      O           BT3335      Upon a Power-On Boot Failure this pin will go
                                                                       active high until reset.
Thermal Diode                   THERMALi       I           THERMAL     Thermal Diode on the iTAP Switch specified @
                                                                       xxxxxxx
Thermal Diode                   THERMALo       O           THERMAL     Thermal Diode on the iTAP Switch specified @
                                                                       xxxxxxx



Total number of pins: 9 
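
The strap encodings above can be summarized as a small decoder. This is only a sketch of how the latched values might be interpreted: the function name, argument names, and returned dictionary are hypothetical, and rejecting BOOTDEV = 10 is an assumption, since the table does not define that encoding.

```python
# Boot device encodings listed for BOOTDEV[1:0]; 0b10 is not defined in the table.
BOOT_DEVICES = {
    0b00: "Serial Prom (SPI)",
    0b01: "Internal Shared RAM",
    0b11: "External FCRAM",
}


def decode_boot_straps(bootrisc, bootdev, bootcfg, risc_clk_sel):
    """Interpret the strap pins as latched at reset (hypothetical helper)."""
    if bootdev not in BOOT_DEVICES:
        # Assumption: treat the undocumented encoding as an error.
        raise ValueError(f"BOOTDEV={bootdev:02b} is not defined")

    config = {
        "boot": bool(bootrisc),                      # 0 = No Boot, 1 = Boot
        "device": BOOT_DEVICES[bootdev],
        "clock": "Risc Clock" if risc_clk_sel else "Core Clock",
    }

    # BOOTCFG only carries meaning when booting from the serial PROM.
    if bootdev == 0b00:
        config["spi_addr_bits"] = 24 if (bootcfg >> 1) & 1 else 16   # BOOTCFG[1]
        config["spi_chip_select"] = bootcfg & 1                      # BOOTCFG[0]

    return config


# Example: boot from a serial PROM with 24 bit addressing on SPI CS 0,
# clocked from the core clock.
print(decode_boot_straps(bootrisc=1, bootdev=0b00, bootcfg=0b10, risc_clk_sel=0))
```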
20.3.8 Switch Fabric Serial Links 



Signal Name                  Mnemonic           Direction   I/O Pad                 Description
Serial Data Links (Inbound Traffic)
Link_IN[11:00]_D[1:0]_P_N    LINK_IN00_D0_P     I           unilink pad name ???    Input Link Data (Unilink)
Link_IN[11:00]_GNT_P_N       LINK_IN00_GNT_P    O                                   Input Link Grant (Unilink)
Serial Data Links (Outbound Traffic)
Link_OUT[11:00]_D[1:0]_P_N   LINK_OUT00_D0_P    O                                   Output Link Data (Unilink)
Link_OUT[11:00]_GNT_P_N      LINK_OUT00_GNT_P   I                                   Output Link Grant (Unilink)



Total number of pins: 144 
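
The 144-pin total follows from 12 links per direction, each with two differential data pairs and one differential grant pair. The sketch below enumerates the pins to confirm the count. The name pattern (for example LINK_IN00_D0_P) is an assumed reading of the mnemonics in the table above, and the function name and parameters are hypothetical.

```python
def link_pin_names(num_links=12, data_pairs=2):
    """Enumerate serial-link pin names under the assumed naming pattern."""
    names = []
    for direction in ("IN", "OUT"):
        for link in range(num_links):
            # Each data lane is a differential pair (P and N).
            for lane in range(data_pairs):
                for polarity in ("P", "N"):
                    names.append(f"LINK_{direction}{link:02d}_D{lane}_{polarity}")
            # One differential grant pair per link.
            for polarity in ("P", "N"):
                names.append(f"LINK_{direction}{link:02d}_GNT_{polarity}")
    return names


pins = link_pin_names()
print(len(pins))   # 2 directions x 12 links x (2 data pairs + 1 grant pair) x 2 = 144
```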
20.3.9 System Control

Signal Name            Mnemonic        Direction   I/O Pad    Description
Ref Clock 1 Input      PLLA_CLK(+/-)   I           IPECLD     PLL A Clock Input -primary input-**
Ref Clock 2 Input      PLLA_CLK(+/-)   I           IPECLD     PLL A Clock Input -primary input-**
RISC CLK               RISC_CLK        I           LVTTL      Embedded Processor Clock
PLL A Feedback         PLLA_FB         O           BT3335     Look at IBM's PLL data book spec.
PLL A Test Enable      PLLA_TST        I           BT3335PD   Test Input
PLL A Analog Power     PLLA_VDD        PWR
PLL A Analog Ground    PLLA_GND        GND
PLL B Feedback         PLLB_FB         O           BT3335     Look at IBM's PLL data book spec.
PLL B Test Enable      PLLB_TST        I           BT3335PD   Test Input
PLL B Analog Power     PLLB_VDD        PWR
PLL B Analog Ground    PLLB_GND        GND
PLL C Feedback         PLLC_FB         O           BT3335     Look at IBM's PLL data book spec.
PLL C Test Enable      PLLC_TST        I           BT3335PD   Test Input
PLL C Analog Power     PLLC_VDD        PWR
PLL C Analog Ground    PLLC_GND        GND
Clock Test Mode Pins   CTM[3:0]        O           BT3335     Clock Test Pins
Hardware Reset         HRESETx         I           BT3335     Hardware Reset - resets everything in the chip
Software Reset         SRESETx         I           BT3335     Software Reset - controlled reset, processor, only
                                                              certain registers are reset
Start of Row           SOR_SYNC(+/-)   I           IPECLD     System-wide Start of Row Pulse

Total Number of pins: 25
20.3.10 Test Access Port

Signal Name           Mnemonic    Direction   I/O Pad    Description
General Purpose Test  TIB[63:00]  I/O         BT3335PDT  General Purpose Test I/Os. They will also be the
                                                         dedicated IBM test pins.
Test Enable           TE          I           IT33TEPDT  IBM LSSD Dedicated Test Enable Pin
HighZ                 Z           I           BT3335     Tie high to tri-state all I/Os for ATPG Testing

Total Number of pins: 66
20.3.11 Power Supplies

Signal Name         Mnemonic    # Of Pins
VCC (Core - 1.8V)   DVCC18      48
VCC (3.3V)          DVCC33      40
VSS (3.3) & (1.8)   DVSS        80
UniLink VCC18       LVCC18      18
UniLink VCC25       LVCC25      18
UniLink VSS18       LVSS18      18
UniLink VSS25       LVSS25      18



20.3.12 Reference Documents 

IBM SA14-2180-01 Laminate Ball Grid Array 
IBM SA14-2208-02 ASIC SA-27E Databook 

20.4 Package Outline

Figure 20-3: Laminate BGA 624 ball, 1.27 mm pitch, dual/single power zone



21 Electrical Characteristics 

RESERVED 

21.1 I/O Drivers 

RESERVED 

21.1.1 UniLink 

RESERVED 

21.1.2 LVTTL (5V Tolerant I/O) 

RESERVED 

21.1.3 LVTTL 

RESERVED

22 Application Guidelines

RESERVED 






